Deep Dive: How Claude Code Works

Duration: 15 minutes
Prerequisites: Familiarity with LLM agents, tool use / function calling, ReAct-style loops
Learning Objectives:

Understand the single-threaded master loop architecture and why Anthropic chose simplicity over multi-agent swarms
Map out the complete tool ecosystem (reading, editing, execution, planning)
Explain the memory hierarchy: CLAUDE.md files, auto memory, and context compression
Describe the controlled parallelism model with sub-agents and orchestration
Identify design principles that transfer to building your own agent harnesses

1. Introduction: What Claude Code Actually Is (2 min)

Claude Code is Anthropic's agentic coding assistant—not a chatbot you paste code into, but an agent that operates directly in your development environment. It reads your filesystem, writes and edits files, runs shell commands, and iterates autonomously until a task is complete.

Key insight: The agent is the model. Claude Code's architecture is a harness—the scaffolding that gives Claude "hands and eyes" to work with your codebase. The harness doesn't make Claude smart; Claude is already smart. The harness gives Claude a workspace.

This matters because it inverts how most people think about "building agents." You're not building intelligence. You're building an environment for an intelligent model to inhabit.

Framing for the lesson: We'll reverse-engineer Claude Code's architecture to understand the principles that make it effective. These principles generalize beyond coding to any domain where you want an LLM to do extended autonomous work.

2. The Master Loop: Radical Simplicity (3 min)

Architecture Decision: Single-Threaded, Not Multi-Agent

Claude Code runs a single-threaded master loop (internally codenamed "nO"). The loop is deceptively simple:


while model_response contains tool_calls:
    execute_tools(tool_calls)
    append_results_to_history(results)
    model_response = call_model(history)
return model_response  # plain text = done

When Claude produces a response without tool calls, the loop terminates and returns control to the user.

Why Not Multi-Agent Swarms?

Anthropic explicitly chose against the trend toward complex multi-agent orchestration. Their core thesis:

"A simple, single-threaded master loop combined with disciplined tools and planning delivers controllable autonomy."

Trade-offs:

Multi-Agent Swarms	Single Master Loop
Parallel by default	Sequential by default
Complex debugging (which agent failed?)	Transparent debugging (one history)
State synchronization challenges	One flat message history
Emergent coordination bugs	Predictable execution

The single-loop design prioritizes debuggability and reliability over parallelism. You always know exactly what the agent did and in what order.

Real-Time Steering

One crucial production feature: the "h2A" asynchronous dual-buffer queue allows users to inject new instructions mid-task without restarting. This addresses a major pain point—being able to course-correct an autonomous agent without losing context or progress.

3. The Tool Ecosystem (4 min)

Claude Code ships with a small set of highly specialized tools. The minimal toolset is intentional: each tool is tightly scoped, making it easier for the model to select the right one.

Reading & Discovery Tools

Tool	Purpose	Notable Design Choice
View / Read	Read file contents (~2000 lines default)	Chunks large files; Claude decides how much to load
LS	List directory contents	Shows structure without loading content
Glob	Wildcard path matching	Find files by pattern across large repos
GrepTool	Full regex search (mirrors ripgrep)	Not vector search—relies on Claude's regex crafting ability

Design insight: No embeddings, no vector databases. Anthropic bets that Claude's understanding of code structure lets it craft effective regex queries without the operational overhead of maintaining search indices. This is a deliberate simplification that works because the model is capable enough.

Editing Tools

Tool	Purpose
Edit / FileEditTool	Surgical patches via diffs
Write / FileWriteTool	Whole-file creation or replacement
MultiEdit	Batch edits across multiple files
NotebookEditTool	Jupyter notebook manipulation

Edits are displayed as minimal diffs. Every change is tracked for review and potential rollback.

Execution Tools

Tool	Purpose	Safety Measures
Bash	Persistent shell session	Risk classification; confirmation prompts for dangerous ops; filters injection attempts (backticks, shell expansion)
AgentTool	Spawn sub-agents	Depth limits prevent recursive spawning

Bash is the "universal adapter"—if Claude can't do something with a specialized tool, it can often accomplish it via shell commands (git, npm, docker, curl, etc.).

Planning Tools

Tool	Purpose
TodoWrite	Structured JSON task lists with IDs, status, priorities
ThinkTool	Emit reasoning (visible chain-of-thought)
ArchitectTool	Design software architecture without implementing

The TodoWrite tool creates interactive checklists in the UI. After tool calls, the system injects the current TODO state as a system message, preventing Claude from losing track of objectives during long sessions.

4. Memory & Context Management (3 min)

Each Claude Code session starts with a fresh context window. Two mechanisms carry knowledge across sessions:

CLAUDE.md Files (User-Authored)

Markdown files that Claude reads at session start. They provide persistent context: project conventions, build commands, architectural decisions, coding standards.

Hierarchy (all layers combine, more specific overrides on conflict):

~/.claude/CLAUDE.md — Global defaults (all projects)
~/projects/CLAUDE.md — Parent directory rules
./CLAUDE.md — Project-specific
./CLAUDE.local.md — Personal/gitignored
.claude/rules/*.md — Modular topic files

Best practice: Keep CLAUDE.md under 200 lines. It consumes context window tokens. Reference external docs rather than duplicating them.

Auto Memory (Agent-Authored)

Claude can write notes for itself: build commands, debugging insights, code style preferences. These accumulate automatically based on corrections and patterns Claude observes.

Auto memory files live in ~/.claude/projects/<hash>/<session>/. The /memory command shows what's loaded.

Context Compression

When the context window reaches ~92% capacity, the "Compressor wU2" system triggers:

Summarizes the conversation
Moves important information to long-term storage (simple Markdown files)
Continues with compressed context

This is not vector retrieval—just structured summarization and file storage. Another deliberate simplification.

5. Controlled Parallelism: Sub-Agents (2 min)

For tasks requiring exploration or alternative approaches, Claude Code supports sub-agent dispatch via the AgentTool (also called I2A/Task Agent).

Orchestrator–Subagent Pattern


Lead Agent (Orchestrator)
    ├── Spawns Subagent A (search task)
    ├── Spawns Subagent B (review task)
    └── Synthesizes results

Key constraints:

Subagents operate in isolated context windows
Only relevant summaries return to the orchestrator (not full context)
Strict depth limits prevent recursive spawning
Subagents don't communicate directly—all coordination flows through the orchestrator

This is controlled parallelism, not autonomous swarms. The orchestrator maintains oversight and can terminate subagents.

Use cases: Parallel code reviews, searching multiple files simultaneously, exploring alternative implementations.

6. Safety & Permissions (1 min)

Claude Code implements defense in depth:

Layer	Mechanism
Permission prompts	Explicit allow/deny for writes, risky Bash commands, external tools
Command sanitization	Risk classification; blocks injection patterns
Whitelists	Configurable trusted operations
Diff-first workflow	Changes shown as colored diffs; encourages minimal edits
Git as checkpoint	Commits serve as rollback points

The design assumes the user is ultimately in control. Claude asks before doing anything destructive, and the permission system is configurable per-project.

7. Design Principles That Transfer (1 min)

What can we learn from Claude Code's architecture for building our own agents?

The harness is not the agent. Your job is to build the environment; the model provides the intelligence. Don't second-guess the model with elaborate decision trees.
Simplicity scales better than complexity. A single loop with disciplined tools beats a swarm you can't debug.
Give concrete feedback. Linting, test results, diffs—anything that lets the model verify its own work.
Context is finite. Design for compression. Use file storage as extended memory. Don't try to keep everything in the window.
Parallelism needs control. Subagents are powerful but constrained. Depth limits and isolated contexts prevent runaway behavior.

Summary

Claude Code's architecture is deliberately simple:

Single-threaded master loop (not multi-agent)
Small, specialized tool set (grep over embeddings)
Memory via Markdown files (not databases)
Controlled sub-agents with depth limits
Permission system for safety

The lesson: trust the model, engineer the harness. This pattern—one agent loop plus tools plus context management—applies to any domain where you want Claude to do extended autonomous work.

References

Anthropic. (2025). Building agents with the Claude Agent SDK. https://www.anthropic.com/engineering/building-agents-with-the-claude-agent-sdk
Anthropic. (2025). How Claude remembers your project. Claude Code Docs. https://code.claude.com/docs/en/memory
Anthropic. (2025). Tools reference. Claude Code Docs. https://code.claude.com/docs/en/tools-reference
Anthropic. (2025). Equipping agents for the real world with Agent Skills. https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills
ZenML. (2025). Claude Code Agent Architecture: Single-Threaded Master Loop for Autonomous Coding. LLMOps Database. https://www.zenml.io/llmops-database/claude-code-agent-architecture-single-threaded-master-loop-for-autonomous-coding