Deep Dive: How Claude Code Works

Duration: 15 minutes
Prerequisites: Familiarity with LLM agents, tool use / function calling, ReAct-style loops
Learning Objectives:

  1. Understand the single-threaded master loop architecture and why Anthropic chose simplicity over multi-agent swarms

  2. Map out the complete tool ecosystem (reading, editing, execution, planning)

  3. Explain the memory hierarchy: CLAUDE.md files, auto memory, and context compression

  4. Describe the controlled parallelism model with sub-agents and orchestration

  5. Identify design principles that transfer to building your own agent harnesses


1. Introduction: What Claude Code Actually Is (2 min)

Claude Code is Anthropic's agentic coding assistant—not a chatbot you paste code into, but an agent that operates directly in your development environment. It reads your filesystem, writes and edits files, runs shell commands, and iterates autonomously until a task is complete.

Key insight: The agent is the model. Claude Code's architecture is a harness—the scaffolding that gives Claude "hands and eyes" to work with your codebase. The harness doesn't make Claude smart; Claude is already smart. The harness gives Claude a workspace.

This matters because it inverts how most people think about "building agents." You're not building intelligence. You're building an environment for an intelligent model to inhabit.

Framing for the lesson: We'll reverse-engineer Claude Code's architecture to understand the principles that make it effective. These principles generalize beyond coding to any domain where you want an LLM to do extended autonomous work.


2. The Master Loop: Radical Simplicity (3 min)

Architecture Decision: Single-Threaded, Not Multi-Agent

Claude Code runs a single-threaded master loop (internally codenamed "nO"). The loop is deceptively simple:

When Claude produces a response without tool calls, the loop terminates and returns control to the user.

Why Not Multi-Agent Swarms?

Anthropic explicitly chose against the trend toward complex multi-agent orchestration. Their core thesis:

"A simple, single-threaded master loop combined with disciplined tools and planning delivers controllable autonomy."

Trade-offs:

Multi-Agent SwarmsSingle Master Loop
Parallel by defaultSequential by default
Complex debugging (which agent failed?)Transparent debugging (one history)
State synchronization challengesOne flat message history
Emergent coordination bugsPredictable execution

The single-loop design prioritizes debuggability and reliability over parallelism. You always know exactly what the agent did and in what order.

Real-Time Steering

One crucial production feature: the "h2A" asynchronous dual-buffer queue allows users to inject new instructions mid-task without restarting. This addresses a major pain point—being able to course-correct an autonomous agent without losing context or progress.


3. The Tool Ecosystem (4 min)

Claude Code ships with a small set of highly specialized tools. The minimal toolset is intentional: each tool is tightly scoped, making it easier for the model to select the right one.

Reading & Discovery Tools

ToolPurposeNotable Design Choice
View / ReadRead file contents (~2000 lines default)Chunks large files; Claude decides how much to load
LSList directory contentsShows structure without loading content
GlobWildcard path matchingFind files by pattern across large repos
GrepToolFull regex search (mirrors ripgrep)Not vector search—relies on Claude's regex crafting ability

Design insight: No embeddings, no vector databases. Anthropic bets that Claude's understanding of code structure lets it craft effective regex queries without the operational overhead of maintaining search indices. This is a deliberate simplification that works because the model is capable enough.

Editing Tools

ToolPurpose
Edit / FileEditToolSurgical patches via diffs
Write / FileWriteToolWhole-file creation or replacement
MultiEditBatch edits across multiple files
NotebookEditToolJupyter notebook manipulation

Edits are displayed as minimal diffs. Every change is tracked for review and potential rollback.

Execution Tools

ToolPurposeSafety Measures
BashPersistent shell sessionRisk classification; confirmation prompts for dangerous ops; filters injection attempts (backticks, shell expansion)
AgentToolSpawn sub-agentsDepth limits prevent recursive spawning

Bash is the "universal adapter"—if Claude can't do something with a specialized tool, it can often accomplish it via shell commands (git, npm, docker, curl, etc.).

Planning Tools

ToolPurpose
TodoWriteStructured JSON task lists with IDs, status, priorities
ThinkToolEmit reasoning (visible chain-of-thought)
ArchitectToolDesign software architecture without implementing

The TodoWrite tool creates interactive checklists in the UI. After tool calls, the system injects the current TODO state as a system message, preventing Claude from losing track of objectives during long sessions.


4. Memory & Context Management (3 min)

Each Claude Code session starts with a fresh context window. Two mechanisms carry knowledge across sessions:

CLAUDE.md Files (User-Authored)

Markdown files that Claude reads at session start. They provide persistent context: project conventions, build commands, architectural decisions, coding standards.

Hierarchy (all layers combine, more specific overrides on conflict):

  1. ~/.claude/CLAUDE.md — Global defaults (all projects)

  2. ~/projects/CLAUDE.md — Parent directory rules

  3. ./CLAUDE.md — Project-specific

  4. ./CLAUDE.local.md — Personal/gitignored

  5. .claude/rules/*.md — Modular topic files

Best practice: Keep CLAUDE.md under 200 lines. It consumes context window tokens. Reference external docs rather than duplicating them.

Auto Memory (Agent-Authored)

Claude can write notes for itself: build commands, debugging insights, code style preferences. These accumulate automatically based on corrections and patterns Claude observes.

Auto memory files live in ~/.claude/projects/<hash>/<session>/. The /memory command shows what's loaded.

Context Compression

When the context window reaches ~92% capacity, the "Compressor wU2" system triggers:

  1. Summarizes the conversation

  2. Moves important information to long-term storage (simple Markdown files)

  3. Continues with compressed context

This is not vector retrieval—just structured summarization and file storage. Another deliberate simplification.


5. Controlled Parallelism: Sub-Agents (2 min)

For tasks requiring exploration or alternative approaches, Claude Code supports sub-agent dispatch via the AgentTool (also called I2A/Task Agent).

Orchestrator–Subagent Pattern

Key constraints:

This is controlled parallelism, not autonomous swarms. The orchestrator maintains oversight and can terminate subagents.

Use cases: Parallel code reviews, searching multiple files simultaneously, exploring alternative implementations.


6. Safety & Permissions (1 min)

Claude Code implements defense in depth:

LayerMechanism
Permission promptsExplicit allow/deny for writes, risky Bash commands, external tools
Command sanitizationRisk classification; blocks injection patterns
WhitelistsConfigurable trusted operations
Diff-first workflowChanges shown as colored diffs; encourages minimal edits
Git as checkpointCommits serve as rollback points

The design assumes the user is ultimately in control. Claude asks before doing anything destructive, and the permission system is configurable per-project.


7. Design Principles That Transfer (1 min)

What can we learn from Claude Code's architecture for building our own agents?

  1. The harness is not the agent. Your job is to build the environment; the model provides the intelligence. Don't second-guess the model with elaborate decision trees.

  2. Simplicity scales better than complexity. A single loop with disciplined tools beats a swarm you can't debug.

  3. Give concrete feedback. Linting, test results, diffs—anything that lets the model verify its own work.

  4. Context is finite. Design for compression. Use file storage as extended memory. Don't try to keep everything in the window.

  5. Parallelism needs control. Subagents are powerful but constrained. Depth limits and isolated contexts prevent runaway behavior.


Summary

Claude Code's architecture is deliberately simple:

The lesson: trust the model, engineer the harness. This pattern—one agent loop plus tools plus context management—applies to any domain where you want Claude to do extended autonomous work.


References