Duration: 15 minutes
Prerequisites: Familiarity with LLM agents, tool use / function calling, ReAct-style loops
Learning Objectives:
Understand the single-threaded master loop architecture and why Anthropic chose simplicity over multi-agent swarms
Map out the complete tool ecosystem (reading, editing, execution, planning)
Explain the memory hierarchy: CLAUDE.md files, auto memory, and context compression
Describe the controlled parallelism model with sub-agents and orchestration
Identify design principles that transfer to building your own agent harnesses
Claude Code is Anthropic's agentic coding assistant—not a chatbot you paste code into, but an agent that operates directly in your development environment. It reads your filesystem, writes and edits files, runs shell commands, and iterates autonomously until a task is complete.
Key insight: The agent is the model. Claude Code's architecture is a harness—the scaffolding that gives Claude "hands and eyes" to work with your codebase. The harness doesn't make Claude smart; Claude is already smart. The harness gives Claude a workspace.
This matters because it inverts how most people think about "building agents." You're not building intelligence. You're building an environment for an intelligent model to inhabit.
Framing for the lesson: We'll reverse-engineer Claude Code's architecture to understand the principles that make it effective. These principles generalize beyond coding to any domain where you want an LLM to do extended autonomous work.
Claude Code runs a single-threaded master loop (internally codenamed "nO"). The loop is deceptively simple:
while model_response contains tool_calls:execute_tools(tool_calls)append_results_to_history(results)model_response = call_model(history)return model_response # plain text = done
When Claude produces a response without tool calls, the loop terminates and returns control to the user.
Anthropic explicitly chose against the trend toward complex multi-agent orchestration. Their core thesis:
"A simple, single-threaded master loop combined with disciplined tools and planning delivers controllable autonomy."
Trade-offs:
| Multi-Agent Swarms | Single Master Loop |
|---|---|
| Parallel by default | Sequential by default |
| Complex debugging (which agent failed?) | Transparent debugging (one history) |
| State synchronization challenges | One flat message history |
| Emergent coordination bugs | Predictable execution |
The single-loop design prioritizes debuggability and reliability over parallelism. You always know exactly what the agent did and in what order.
One crucial production feature: the "h2A" asynchronous dual-buffer queue allows users to inject new instructions mid-task without restarting. This addresses a major pain point—being able to course-correct an autonomous agent without losing context or progress.
Claude Code ships with a small set of highly specialized tools. The minimal toolset is intentional: each tool is tightly scoped, making it easier for the model to select the right one.
| Tool | Purpose | Notable Design Choice |
|---|---|---|
| View / Read | Read file contents (~2000 lines default) | Chunks large files; Claude decides how much to load |
| LS | List directory contents | Shows structure without loading content |
| Glob | Wildcard path matching | Find files by pattern across large repos |
| GrepTool | Full regex search (mirrors ripgrep) | Not vector search—relies on Claude's regex crafting ability |
Design insight: No embeddings, no vector databases. Anthropic bets that Claude's understanding of code structure lets it craft effective regex queries without the operational overhead of maintaining search indices. This is a deliberate simplification that works because the model is capable enough.
| Tool | Purpose |
|---|---|
| Edit / FileEditTool | Surgical patches via diffs |
| Write / FileWriteTool | Whole-file creation or replacement |
| MultiEdit | Batch edits across multiple files |
| NotebookEditTool | Jupyter notebook manipulation |
Edits are displayed as minimal diffs. Every change is tracked for review and potential rollback.
| Tool | Purpose | Safety Measures |
|---|---|---|
| Bash | Persistent shell session | Risk classification; confirmation prompts for dangerous ops; filters injection attempts (backticks, shell expansion) |
| AgentTool | Spawn sub-agents | Depth limits prevent recursive spawning |
Bash is the "universal adapter"—if Claude can't do something with a specialized tool, it can often accomplish it via shell commands (git, npm, docker, curl, etc.).
| Tool | Purpose |
|---|---|
| TodoWrite | Structured JSON task lists with IDs, status, priorities |
| ThinkTool | Emit reasoning (visible chain-of-thought) |
| ArchitectTool | Design software architecture without implementing |
The TodoWrite tool creates interactive checklists in the UI. After tool calls, the system injects the current TODO state as a system message, preventing Claude from losing track of objectives during long sessions.
Each Claude Code session starts with a fresh context window. Two mechanisms carry knowledge across sessions:
Markdown files that Claude reads at session start. They provide persistent context: project conventions, build commands, architectural decisions, coding standards.
Hierarchy (all layers combine, more specific overrides on conflict):
~/.claude/CLAUDE.md — Global defaults (all projects)
~/projects/CLAUDE.md — Parent directory rules
./CLAUDE.md — Project-specific
./CLAUDE.local.md — Personal/gitignored
.claude/rules/*.md — Modular topic files
Best practice: Keep CLAUDE.md under 200 lines. It consumes context window tokens. Reference external docs rather than duplicating them.
Claude can write notes for itself: build commands, debugging insights, code style preferences. These accumulate automatically based on corrections and patterns Claude observes.
Auto memory files live in ~/.claude/projects/<hash>/<session>/. The /memory command shows what's loaded.
When the context window reaches ~92% capacity, the "Compressor wU2" system triggers:
Summarizes the conversation
Moves important information to long-term storage (simple Markdown files)
Continues with compressed context
This is not vector retrieval—just structured summarization and file storage. Another deliberate simplification.
For tasks requiring exploration or alternative approaches, Claude Code supports sub-agent dispatch via the AgentTool (also called I2A/Task Agent).
Lead Agent (Orchestrator)├── Spawns Subagent A (search task)├── Spawns Subagent B (review task)└── Synthesizes results
Key constraints:
Subagents operate in isolated context windows
Only relevant summaries return to the orchestrator (not full context)
Strict depth limits prevent recursive spawning
Subagents don't communicate directly—all coordination flows through the orchestrator
This is controlled parallelism, not autonomous swarms. The orchestrator maintains oversight and can terminate subagents.
Use cases: Parallel code reviews, searching multiple files simultaneously, exploring alternative implementations.
Claude Code implements defense in depth:
| Layer | Mechanism |
|---|---|
| Permission prompts | Explicit allow/deny for writes, risky Bash commands, external tools |
| Command sanitization | Risk classification; blocks injection patterns |
| Whitelists | Configurable trusted operations |
| Diff-first workflow | Changes shown as colored diffs; encourages minimal edits |
| Git as checkpoint | Commits serve as rollback points |
The design assumes the user is ultimately in control. Claude asks before doing anything destructive, and the permission system is configurable per-project.
What can we learn from Claude Code's architecture for building our own agents?
The harness is not the agent. Your job is to build the environment; the model provides the intelligence. Don't second-guess the model with elaborate decision trees.
Simplicity scales better than complexity. A single loop with disciplined tools beats a swarm you can't debug.
Give concrete feedback. Linting, test results, diffs—anything that lets the model verify its own work.
Context is finite. Design for compression. Use file storage as extended memory. Don't try to keep everything in the window.
Parallelism needs control. Subagents are powerful but constrained. Depth limits and isolated contexts prevent runaway behavior.
Claude Code's architecture is deliberately simple:
Single-threaded master loop (not multi-agent)
Small, specialized tool set (grep over embeddings)
Memory via Markdown files (not databases)
Controlled sub-agents with depth limits
Permission system for safety
The lesson: trust the model, engineer the harness. This pattern—one agent loop plus tools plus context management—applies to any domain where you want Claude to do extended autonomous work.
Anthropic. (2025). Building agents with the Claude Agent SDK. https://www.anthropic.com/engineering/building-agents-with-the-claude-agent-sdk
Anthropic. (2025). How Claude remembers your project. Claude Code Docs. https://code.claude.com/docs/en/memory
Anthropic. (2025). Tools reference. Claude Code Docs. https://code.claude.com/docs/en/tools-reference
Anthropic. (2025). Equipping agents for the real world with Agent Skills. https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills
ZenML. (2025). Claude Code Agent Architecture: Single-Threaded Master Loop for Autonomous Coding. LLMOps Database. https://www.zenml.io/llmops-database/claude-code-agent-architecture-single-threaded-master-loop-for-autonomous-coding