Lesson 22

Agent Mode Development

AI-generated

Learning Objectives

Understand the agent loop (observe-think-act)
Know when to use agent mode vs. chat or completion
Maintain appropriate oversight during agent work
Give effective high-level task descriptions
Recover from agent mistakes

Developer Track: Advanced AI-Assisted Coding

This lesson covers agent mode in AI coding assistants. In agent mode, the AI acts autonomously: it reads files, writes code, runs commands, and iterates on the results. This is developer-specific content; non-developers should skip to Unit 6.

Agent mode is powerful and risky. This lesson teaches you to use it effectively while maintaining control.

The Agent Loop: How AI Coding Agents Work

Understanding the agent loop helps you work with it rather than against it.

The Basic Cycle

Observe: Agent reads relevant files, error messages, terminal output
Think: Agent reasons about what to do next
Act: Agent writes code, runs commands, or makes changes
Repeat: Agent observes the results and continues

This loop continues until the task is complete or the agent gets stuck.
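
The cycle can be sketched as a small loop. This is a minimal illustration, not how any particular product is implemented: `run_agent`, `Step`, and the `observe`/`think`/`act` callables are hypothetical names, and the `max_steps` cap is an assumption added so the loop stops when the agent gets stuck.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    """One pass through the loop (hypothetical structure for illustration)."""
    observation: str
    action: Optional[str]  # None means the agent decided the task is complete


def run_agent(
    observe: Callable[[], str],
    think: Callable[[str], Optional[str]],
    act: Callable[[str], None],
    max_steps: int = 10,
) -> list[Step]:
    """Run observe-think-act until think() returns None or max_steps is reached."""
    history: list[Step] = []
    for _ in range(max_steps):
        observation = observe()        # Observe: read files, errors, output
        action = think(observation)    # Think: decide what to do next
        history.append(Step(observation, action))
        if action is None:             # Task complete: stop looping
            break
        act(action)                    # Act: apply the change or run a command
    return history


# Toy usage: a pretend test suite that fails until one fix is applied.
state = {"tests_pass": False}


def observe() -> str:
    return "tests pass" if state["tests_pass"] else "1 test failing"


def think(observation: str) -> Optional[str]:
    return None if observation == "tests pass" else "apply fix"


def act(action: str) -> None:
    state["tests_pass"] = True


history = run_agent(observe, think, act)
print([step.action for step in history])  # ['apply fix', None]
```

The step cap matters in practice: an agent whose `think` step never concludes the task is finished would otherwise loop indefinitely.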

What Agents Can Do

Modern AI coding agents like Claude Code can:

Read files anywhere in your codebase
Create new files and directories
Make targeted edits to existing files
Run terminal commands (tests, linters, builds)
Iterate based on error messages
Search and find relevant code
Execute multi-step plans

What Agents Cannot Do (Well)

Understand requirements you did not express
Know your preferences without instruction files
Access external services they are not configured for
Make judgment calls about business requirements
Guarantee correctness of complex logic

The Trust Gradient

Task Type	Agent Reliability	Oversight Needed
Scaffolding/boilerplate	High	Light review
Tests for existing code	High	Review edge cases
Simple refactoring	Medium-High	Review changes
New feature implementation	Medium	Detailed review
Security-sensitive code	Lower	Careful review
Complex business logic	Lower	Verify requirements

When to Use Agent Mode (and When Not To)

Agent mode is not always the right choice. Match the mode to the task.

When Agent Mode Excels

Multi-file changes: Refactoring that touches many files
Repetitive transformations: Apply the same pattern across the codebase
Scaffolding: Generate boilerplate structure
Test generation: Write tests for existing code
Bug hunting: "Find and fix the bug where X happens"
Documentation: Generate docs from code

When to Prefer Chat or Completion

Learning: When you want to understand, not just get code
Exploration: When you are unsure what you need
Small changes: When editing is faster than explaining
High-stakes logic: When you need to think through each line
Security-critical: When AI errors could create vulnerabilities

The Complexity Threshold

A good rule: if explaining the task in natural language is faster than doing it yourself, use agent mode. If explaining would take longer, just write the code.

Effective Task Descriptions for Agents

How you describe tasks determines agent success. Be clear and structured.

The Good Task Description Template

Goal: What should be different when done?
Scope: What should change and what should not?
Constraints: Any requirements or limitations?
Verification: How will we know it worked?
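
One way to make the template a habit is to encode it so that a missing field is caught before the task reaches the agent. The sketch below is an illustration only: `TaskDescription` and its methods are hypothetical names, and treating goal, scope, and verification as required while leaving constraints optional is an assumption, not a rule from any tool.

```python
from dataclasses import dataclass, field


@dataclass
class TaskDescription:
    """Hypothetical container for the four template fields."""
    goal: str
    scope: str
    verification: str
    constraints: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Return names of required fields that are empty or whitespace."""
        required = {
            "goal": self.goal,
            "scope": self.scope,
            "verification": self.verification,
        }
        return [name for name, value in required.items() if not value.strip()]

    def render(self) -> str:
        """Format the task as a prompt; refuse if a required field is blank."""
        missing = self.missing_fields()
        if missing:
            raise ValueError(f"Task description is missing: {', '.join(missing)}")
        lines = [f"Goal: {self.goal}", f"Scope: {self.scope}"]
        if self.constraints:
            lines.append("Constraints:")
            lines.extend(f"- {c}" for c in self.constraints)
        lines.append(f"Verification: {self.verification}")
        return "\n".join(lines)


# Illustrative values only.
task = TaskDescription(
    goal="Users can log in after a password reset.",
    scope="Password reset flow and session handling only.",
    constraints=["Do not change existing tests."],
    verification="Run the auth test suite.",
)
print(task.render())
```

A one-line request such as "Fix the login bug" would fail this check, because it states a goal but no scope or verification step.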

Examples of Good vs. Bad Descriptions

Bad: "Fix the login bug"

Good: "Users report they cannot log in after password reset. The bug is likely in the password reset flow or session handling. Find where the session is not being properly created after password reset and fix it. Verify by running the auth test suite."

Bad: "Add dark mode"

Good: "Add a dark mode toggle to the settings page. Use CSS custom properties (we already have --color-bg-primary etc. defined in globals.css). The toggle should persist to localStorage. Update all components in src/components/ that have hardcoded colors. Do not change the color values themselves; just ensure they use the CSS variables."

Specifying What NOT to Change

What the agent should leave alone is often as important as what it should do:

"Refactor the user service to use the repository pattern. Create interfaces first. Do not change any existing tests; they should still pass. Do not modify the controller layer; keep the existing API signatures."

Requesting Plan Review

For complex tasks, have the agent plan before executing:

"I want to add WebSocket support for real-time notifications. Before writing any code, show me your plan: what files you'll create, what you'll modify, and the rough implementation approach. I'll approve before you proceed."

Human-in-the-Loop: Oversight Without Micromanagement

The goal is appropriate oversight: enough control to catch problems, not so much that you lose agent benefits.

Oversight Strategies

For low-risk tasks: Let the agent run, review the diff at the end

For medium-risk tasks: Request plan approval, then let it execute

For high-risk tasks: Step through in stages, approving each phase
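
The three strategies amount to a lookup from risk level to review style. The sketch below is one possible encoding. The `TASK_RISK` mapping from task type to risk level is an assumption made for illustration, as is the choice to treat unlisted task types as high risk; `Risk`, `OVERSIGHT`, and `oversight_for` are hypothetical names.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Oversight strategy per risk level, following the three strategies above.
OVERSIGHT = {
    Risk.LOW: "Let the agent run, review the diff at the end",
    Risk.MEDIUM: "Request plan approval, then let it execute",
    Risk.HIGH: "Step through in stages, approving each phase",
}

# Assumed mapping from task type to risk; adjust for your own codebase.
TASK_RISK = {
    "scaffolding": Risk.LOW,
    "tests for existing code": Risk.LOW,
    "simple refactoring": Risk.MEDIUM,
    "new feature": Risk.MEDIUM,
    "security-sensitive code": Risk.HIGH,
    "complex business logic": Risk.HIGH,
}


def oversight_for(task_type: str) -> str:
    """Look up the oversight strategy; unknown task types default to HIGH risk."""
    risk = TASK_RISK.get(task_type.lower(), Risk.HIGH)
    return OVERSIGHT[risk]


print(oversight_for("Scaffolding"))
print(oversight_for("database migration"))  # not listed, so treated as high risk
```

Defaulting unknown work to the strictest strategy is a conservative choice: oversight can be relaxed once the agent has proved reliable on that kind of task.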

Reading Agent Diffs Effectively

When reviewing agent changes:

Start with the file list. Any unexpected files?
Check for deletions or large modifications
Review new code for obvious issues
Run tests before committing

Red Flags to Watch For

Unexpected dependency additions
Changes to files outside stated scope
Removed error handling or validation
Hardcoded values that should be configurable
Test modifications (unless explicitly requested)
Changes to security-sensitive code (auth, crypto, input handling)
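
Some of these red flags can be spotted from the list of changed paths alone, before reading any code. The sketch below is a rough heuristic under stated assumptions: `flag_changes`, the `DEPENDENCY_FILES` set, the `SENSITIVE_HINTS` substrings, and the test-file naming patterns are all hypothetical and would need adjusting to a real project. It cannot detect removed error handling or hardcoded values, which still require reading the diff.

```python
from pathlib import PurePosixPath

# Assumed file names and path hints; adjust to your project's conventions.
DEPENDENCY_FILES = {"package.json", "requirements.txt", "pyproject.toml", "go.mod", "Cargo.toml"}
SENSITIVE_HINTS = ("auth", "crypto", "security")


def flag_changes(
    changed: list[str],
    allowed_dirs: list[str],
    tests_requested: bool = False,
) -> dict[str, list[str]]:
    """Group changed paths by the red flag they raise (a path may raise several)."""
    flags: dict[str, list[str]] = {
        "out_of_scope": [],
        "dependencies": [],
        "tests": [],
        "sensitive": [],
    }
    prefixes = tuple(d.rstrip("/") + "/" for d in allowed_dirs)
    for path in changed:
        p = PurePosixPath(path)
        if not path.startswith(prefixes):
            flags["out_of_scope"].append(path)
        if p.name in DEPENDENCY_FILES:
            flags["dependencies"].append(path)
        is_test = p.name.startswith("test_") or ".test." in p.name or "tests" in p.parts
        if is_test and not tests_requested:
            flags["tests"].append(path)
        # Substring match is crude: "auth" also matches "author".
        if any(hint in path.lower() for hint in SENSITIVE_HINTS):
            flags["sensitive"].append(path)
    return flags


# Illustrative paths only.
changed = [
    "src/components/Toggle.tsx",
    "src/components/Toggle.test.tsx",
    "src/auth/session.ts",
    "package.json",
]
report = flag_changes(changed, allowed_dirs=["src/components"])
for flag, paths in report.items():
    print(flag, paths)
```

In this toy run, the session file and `package.json` fall outside the stated scope, and the test file is flagged because test changes were not requested.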

Correcting Agent Mistakes

When the agent goes wrong:

Minor issues: "Good progress, but fix these issues: [specific list]. Do not rewrite everything; just address these points."

Wrong direction: "Let's stop. This approach won't work because [reason]. Instead, try [alternative approach]. Start fresh on this task."

Recovery: "The last change broke tests. Read the error output and fix the issue. Do not change anything unrelated to the test failure."

Key Takeaways

Understand the loop: Observe-think-act-repeat is how agents work
Match mode to task: Agent mode for multi-file, repetitive, or investigative work
Write clear descriptions: Goal, scope, constraints, verification
Specify what not to do: Prevents scope creep and unintended changes
Maintain appropriate oversight: Scale review to the risk level

Try It Yourself

Try agent mode with this exercise:

Pick a real task in a codebase:

- Easy: Add a new component with tests
- Medium: Refactor to extract common logic
- Hard: Find and fix a bug you know exists

Write a task description with:

- Clear goal
- Explicit scope
- What not to change
- How to verify

Let the agent plan. Review the plan before approving.
After execution, review all changes. Note:

- What did the agent get right?
- What needed correction?
- How would you write the task differently next time?
