Claude Code Architecture Deep Dive

Not a chat wrapper — a full multi-agent operating system with 3 subagent models, 7-layer memory, 5 context compression strategies, and 88 compile-time feature gates. Leaked via .map sourcemap in npm package, March 31, 2026.

How It Leaked

Anthropic uses Bun as the build tool. Bun’s bundler can emit sourcemaps alongside the bundle, and this build did. No rule in .npmignore excluded *.map, so the package shipped cli.js.map (59.8 MB) containing 1,900 TypeScript files and 512K+ lines of original source. It was exposed for ~3 hours before takedown.
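
A pre-publish check along the following lines would catch this class of mistake. It is a minimal sketch: `findLeakedSourcemaps` and the sample file list are hypothetical, and a real check would be fed the file list the package manager reports for the tarball.

```typescript
// Hypothetical pre-publish guard: flag sourcemap files in a package's file list.
function findLeakedSourcemaps(packedFiles: string[]): string[] {
  return packedFiles.filter((file) => file.toLowerCase().endsWith(".map"));
}

// Illustrative file list, not the real package contents.
const packedFiles = ["package.json", "cli.js", "cli.js.map", "README.md"];
const leaked = findLeakedSourcemaps(packedFiles);
if (leaked.length > 0) {
  console.warn(`Refusing to publish, sourcemaps found: ${leaked.join(", ")}`);
}
```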

Irony: Claude Code has an internal “Undercover Mode” to prevent AI from leaking codenames in git commits — but Anthropic itself leaked the entire source, likely via a build process operated by Claude.

Core Architecture

src/
├── main.tsx          # CLI entry · Commander.js + Ink REPL (4,683 lines)
├── query.ts          # Core Agent Loop · largest single file (785 KB)
├── QueryEngine.ts    # SDK/Headless query lifecycle (~1,295 lines)
├── Tool.ts           # Tool interface + buildTool factory (29K lines base)
├── tools/            # ~40 tool implementations
├── commands/         # ~50 slash commands
├── components/       # ~140 React/Ink UI components
├── coordinator/      # Multi-agent Coordinator system
├── memdir/           # Persistent memory directory
├── skills/           # Reusable workflow definitions
├── plugins/          # Plugin system
├── bridge/           # VS Code / JetBrains IDE integration
├── buddy/            # Tamagotchi companion (BUDDY flag)
└── constants/
    └── betas.ts      # All beta API header definitions

Common misconception: The 785KB file is query.ts (the agent loop), NOT main.tsx. This error was widely propagated by secondary articles.

The Agent Loop (query.ts)

The core is a while loop + streaming + tool injection pattern:

while (true) {
  1. Check token budget → compress if over 85%
  2. Stream request to Claude API
  3. Collect text chunks (yield to user) + tool calls
  4. If no tool calls → break (task complete)
  5. Execute tools in parallel (Promise.all)
  6. Inject tool results back into message history
  7. Continue loop
}
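
The loop above can be sketched as runnable TypeScript. This is an illustration under assumptions, not the code in query.ts: every name (`Model`, `ToolRunner`, `runAgentLoop`, the fake model and tool) is hypothetical, streaming and the token-budget check in step 1 are left out, and a turn cap is added so the sketch always terminates.

```typescript
// All types and names below are illustrative assumptions.
type ToolCall = { name: string; input: string };
type ModelTurn = { text: string; toolCalls: ToolCall[] };
type Message = { role: "user" | "assistant" | "tool"; content: string };
type Model = (history: Message[]) => Promise<ModelTurn>;
type ToolRunner = (call: ToolCall) => Promise<string>;

async function runAgentLoop(
  model: Model,
  runTool: ToolRunner,
  history: Message[],
  maxTurns = 10, // safety cap so the sketch cannot spin forever
): Promise<Message[]> {
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await model(history); // steps 2-3: request and collect
    history.push({ role: "assistant", content: reply.text });
    if (reply.toolCalls.length === 0) break; // step 4: no tool calls, task complete
    // Step 5: run every tool call from this turn concurrently.
    const results = await Promise.all(reply.toolCalls.map((call) => runTool(call)));
    for (const result of results) {
      history.push({ role: "tool", content: result }); // step 6: inject results
    }
  }
  return history;
}

// A fake model that asks for two tool calls once, then finishes.
const fakeModel: Model = async (history) =>
  history.some((m) => m.role === "tool")
    ? { text: "done", toolCalls: [] }
    : {
        text: "reading files",
        toolCalls: [
          { name: "read", input: "a.ts" },
          { name: "read", input: "b.ts" },
        ],
      };
const fakeTool: ToolRunner = async (call) => `${call.name}:${call.input}`;
```

Because `Promise.all` preserves input order, tool results land in the history in the same order the model requested them, even though they run concurrently.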

Key details:

Token budget management: Each subagent gets an allocated budget. Exceeding triggers compression, not errors.
14 cache-break vectors: Tracks conditions that invalidate prompt cache (model switch, tool schema update, CLAUDE.md change, etc.). Minimizing cache misses is a core cost optimization.
Parallel tool execution: Multiple tool calls in a single response are executed concurrently via Promise.all.
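
One way to track cache-break vectors is to compare the cache-relevant inputs of consecutive requests and report which ones changed. The sketch below covers only the three vectors named above; the `CacheInputs` shape and `changedVectors` helper are assumptions, and the full list of 14 is not reproduced.

```typescript
// Hypothetical shape covering three of the vectors named in the text.
type CacheInputs = {
  model: string;
  toolSchemaVersion: string;
  claudeMdHash: string;
};

// Return the names of the inputs that differ between two requests.
function changedVectors(prev: CacheInputs, next: CacheInputs): (keyof CacheInputs)[] {
  const keys = Object.keys(prev) as (keyof CacheInputs)[];
  return keys.filter((key) => prev[key] !== next[key]);
}

const before: CacheInputs = { model: "model-a", toolSchemaVersion: "1", claudeMdHash: "abc" };
const after: CacheInputs = { ...before, claudeMdHash: "def" };
const breaks = changedVectors(before, after);
```

An empty result means the cached prompt prefix can be reused; any entry in the result names the reason the cache was invalidated.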

5 Context Compression Strategies

When context approaches the limit, Claude Code doesn’t error — it compresses:

Strategy	Description	Cost
Tool result compression	Truncate large tool outputs (file contents, etc.)	Zero
Image downscaling	Reduce screenshot resolution	Zero
Cache-aware pruning	Delete only messages after the cache boundary, preserving the cached prefix	Zero (keeps prompt-cache savings)
Summarization	Call Claude to generate history summary, replace original messages	One API call
Truncation	Drop oldest messages	Zero (last resort)
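
The table reads as a cascade: try the cheapest strategy first and stop once the context fits. A minimal sketch of that idea follows, showing only two of the five strategies. The function names, the 200-character cutoff, and the token estimate (characters divided by 4) are all assumptions made for illustration.

```typescript
type Msg = { role: string; content: string };
type Strategy = { name: string; apply: (msgs: Msg[], budget: number) => Msg[] };

// Rough token estimate: an assumption, not the real tokenizer.
const estimateTokens = (msgs: Msg[]): number =>
  Math.ceil(msgs.reduce((sum, m) => sum + m.content.length, 0) / 4);

// Ordered from cheapest to last resort, as in the table.
const strategies: Strategy[] = [
  {
    name: "tool-result-compression",
    apply: (msgs) =>
      msgs.map((m) =>
        m.role === "tool" && m.content.length > 200
          ? { ...m, content: m.content.slice(0, 200) + " [truncated]" }
          : m,
      ),
  },
  {
    name: "truncation",
    apply: (msgs, budget) => {
      let out = msgs;
      // Drop the oldest message until the rest fits or one message remains.
      while (out.length > 1 && estimateTokens(out) > budget) out = out.slice(1);
      return out;
    },
  },
];

function compress(msgs: Msg[], budget: number): { msgs: Msg[]; used: string[] } {
  const used: string[] = [];
  let current = msgs;
  for (const strategy of strategies) {
    if (estimateTokens(current) <= budget) break; // already fits, stop early
    current = strategy.apply(current, budget);
    used.push(strategy.name);
  }
  return { msgs: current, used };
}
```

The `used` list makes the degradation path observable, which is useful when tuning the order or thresholds.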

Tool System (29K Lines)

Each tool is a self-describing, permission-gated plugin unit:

interface Tool<TInput, TOutput> {
  name: string
  description: string           // Used in Claude's system prompt
  inputSchema: JSONSchema        // Validated before execution
  permissionLevel: 'always-allow' | 'ask-once' | 'ask-always'
  isReadOnly: boolean
  execute(input: TInput): Promise<TOutput>
}

Permission Gate — three layers:

Tool-level: Each tool declares its own permission level
Bash security: bashSecurity.ts has 23 named checks gating every shell command
Coordinator approval: Dangerous operations from worker agents route to coordinator for human approval
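
The tool-level layer can be sketched as a small decision function over the `permissionLevel` values from the interface above. `PermissionGate` and its methods are hypothetical, and the reading of 'ask-once' as "prompt once per session" is an assumption.

```typescript
type PermissionLevel = "always-allow" | "ask-once" | "ask-always";
type Decision = "run" | "prompt-user";

// Hypothetical gate: decides whether a tool call may run without prompting.
class PermissionGate {
  private approvedOnce = new Set<string>();

  decide(toolName: string, level: PermissionLevel): Decision {
    if (level === "always-allow") return "run";
    if (level === "ask-once" && this.approvedOnce.has(toolName)) return "run";
    return "prompt-user"; // 'ask-always', or 'ask-once' not yet approved
  }

  // Remember an approval so 'ask-once' tools stop prompting in this session.
  recordApproval(toolName: string, level: PermissionLevel): void {
    if (level === "ask-once") this.approvedOnce.add(toolName);
  }
}

const gate = new PermissionGate();
const firstEdit = gate.decide("edit_file", "ask-once");
gate.recordApproval("edit_file", "ask-once");
const secondEdit = gate.decide("edit_file", "ask-once");
```

Keeping the level on the tool rather than at the call site means a new caller cannot accidentally skip the prompt.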

Three Subagent Execution Models

Subagents can run in one of three ways, each trading isolation against context sharing:

Model	Isolation	Context Sharing	Best For
Fork	Separate process	Read-only snapshot of parent context	Long-running, high-risk tasks (refactoring)
Teammate	AsyncLocalStorage (in-process)	Shared session state + scratchpad	Fast parallel subtasks within same session
Worktree	Git worktree (separate branch)	Independent working directory	Parallel code experiments, A/B comparison

Fork model: Child gets a curated subset of parent context, scoped tools, allocated budget, and read-only memory snapshot. Results return to parent without polluting parent context.

Coordinator mode: When activated, Claude becomes a “director” — dispatches Workers in parallel. The system prompt explicitly states: “Parallelism is your superpower. Don’t serialize work that can run simultaneously.” and “Do NOT say ‘based on your findings’ — read the actual findings and specify exactly what to do.”

Worker communication: XML-based protocol with structured task notifications including status, results, and suggested next actions. Workers share persistent findings via a shared scratchpad directory (tengu_scratch feature gate).
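
The Teammate row in the table above depends on Node's `AsyncLocalStorage`, which gives each async call chain its own context while everything stays in one process. The sketch below shows that mechanism only: `TeammateContext`, `runTeammate`, and the in-memory `scratchpad` array are assumptions standing in for whatever the real session state looks like.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical per-teammate context; the field name is an assumption.
type TeammateContext = { agentId: string };
const teammateStore = new AsyncLocalStorage<TeammateContext>();

// Shared session state that every teammate can write to.
const scratchpad: string[] = [];

async function teammateTask(note: string): Promise<void> {
  await Promise.resolve(); // yield, so the two tasks interleave
  const ctx = teammateStore.getStore();
  scratchpad.push(`${ctx?.agentId ?? "unknown"}: ${note}`);
}

function runTeammate(agentId: string, note: string): Promise<void> {
  // Everything awaited inside `run` sees this teammate's context.
  return teammateStore.run({ agentId }, () => teammateTask(note));
}

const teammatesDone = Promise.all([
  runTeammate("teammate-a", "lint passed"),
  runTeammate("teammate-b", "tests passed"),
]);
```

Each task reads its own `agentId` without it being passed as an argument, while both write to the same scratchpad. That is the trade the table describes: cheap sharing, weak isolation.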

7-Layer Memory Architecture

Layer	Name	Persistence	Description
L1	In-context	Session only	Current messages array
L2	Working Memory	Project-level	CLAUDE.md + pinned context, injected at session start
L3	Episodic	Log-level	Append-only logs maintained by KAIROS daemon
L4	Semantic	Core knowledge	Solidified facts in memdir/ (autoDream output)
L5	Procedural	Skill-level	Reusable workflows in skills/ directory
L6	Contact	Relationship	Known people and roles across sessions
L7	Team	Cross-user	Shared remote state with SHA-256 delta sync + git-leaks protection
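
For the Team layer, "SHA-256 delta sync" suggests comparing content hashes and sending only entries that differ. The following is a sketch of that general technique using Node's `crypto` module, not the actual sync protocol; `MemoryStore`, `digestOf`, and `delta` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Hypothetical memory store: entry key -> entry content.
type MemoryStore = Record<string, string>;

const sha256 = (text: string): string =>
  createHash("sha256").update(text).digest("hex");

// Hash every entry so peers can compare digests instead of full contents.
function digestOf(store: MemoryStore): Record<string, string> {
  const digests: Record<string, string> = {};
  for (const key of Object.keys(store)) digests[key] = sha256(store[key]);
  return digests;
}

// Keep only local entries whose hash differs from, or is missing on, the remote.
function delta(local: MemoryStore, remoteDigests: Record<string, string>): MemoryStore {
  const out: MemoryStore = {};
  for (const key of Object.keys(local)) {
    if (remoteDigests[key] !== sha256(local[key])) out[key] = local[key];
  }
  return out;
}

const localMemory: MemoryStore = { style: "tabs", reviewer: "alice" };
const remoteMemory: MemoryStore = { style: "tabs", reviewer: "bob" };
const toSend = delta(localMemory, digestOf(remoteMemory));
```

Only the `reviewer` entry is selected for upload here, because the `style` entry hashes to the same digest on both sides.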

autoDream: Background Memory Consolidation

When the user has been idle for 5+ minutes, KAIROS spawns a background subagent that “dreams”:

Phase 1: Scan    — Extract observations from daily log
Phase 2: Merge   — Combine similar observations, find patterns
Phase 3: Refine  — Remove contradictions against existing semantic memory
Phase 4: Commit  — Convert vague observations → absolute facts

Example: “User keeps editing auth.ts” → “JWT token expiry changed from 1h to 24h”

This resembles MemGPT’s memory consolidation, but it runs as an idle-triggered subagent rather than an always-on process.
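
The four phases can be sketched as a small pipeline. This is a toy under stated assumptions: log lines are assumed to look like "topic: note", Refine is simplified to "a newer observation replaces the existing fact on the same topic", and Commit joins notes where the described system would call a model. All function and type names are hypothetical.

```typescript
type Observation = { topic: string; note: string };
type Fact = { topic: string; statement: string };

// Phase 1: Scan. Parse "topic: note" lines and skip anything else.
function scan(log: string[]): Observation[] {
  return log
    .map((line) => line.split(": "))
    .filter((parts) => parts.length === 2)
    .map(([topic, note]) => ({ topic, note }));
}

// Phase 2: Merge. Group notes by topic and drop exact duplicates.
function merge(observations: Observation[]): Map<string, string[]> {
  const byTopic = new Map<string, string[]>();
  for (const { topic, note } of observations) {
    const notes = byTopic.get(topic) ?? [];
    if (!notes.includes(note)) notes.push(note);
    byTopic.set(topic, notes);
  }
  return byTopic;
}

// Phase 3: Refine. Remove existing facts that the new observations supersede.
function refine(memory: Fact[], merged: Map<string, string[]>): Fact[] {
  return memory.filter((fact) => !merged.has(fact.topic));
}

// Phase 4: Commit. Write one fact per topic (a model call in the described system).
function commit(kept: Fact[], merged: Map<string, string[]>): Fact[] {
  const fresh: Fact[] = [];
  merged.forEach((notes, topic) => fresh.push({ topic, statement: notes.join("; ") }));
  return [...kept, ...fresh];
}

function dream(log: string[], memory: Fact[]): Fact[] {
  const merged = merge(scan(log));
  return commit(refine(memory, merged), merged);
}
```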

10 Reusable Engineering Patterns

Patterns extracted from Claude Code that apply to any LLM agent product:

#	Pattern	Key Idea
1	Permission-gated Tool Interface	Tools self-declare permission level, not the caller
2	Startup Parallel Prefetch	All startup IO in Promise.all, heavy modules lazy-loaded
3	5-level Context Compression	Graceful degradation, not hard failure on context overflow
4	3 Subagent Execution Models	Fork/Teammate/Worktree — match isolation to task risk
5	autoDream Idle Consolidation	Batch memory processing during idle, don’t pollute main context
6	Cache-break Vector Tracking	Actively minimize prompt cache misses for cost control
7	Task Budget Management	Per-subagent token budgets; compress on exceed, don’t error
8	Coordinator “Parallel Superpower”	System prompt enforces parallelism, bans lazy delegation
9	Compile-time Feature Flags	Dead code elimination per tier, not runtime if/else
10	Frustration Detection	Regex-based emotion detection triggers mode switches
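
Pattern 10 is simple enough to sketch directly. The regexes below are invented for illustration and are not the ones in the source; `looksFrustrated` is a hypothetical helper.

```typescript
// Hypothetical patterns; the source's actual regexes are not reproduced here.
const FRUSTRATION_PATTERNS: RegExp[] = [
  /\b(still|again)\b.*\b(broken|failing|wrong)\b/i,
  /\bthis (is not|isn't|doesn't) work(ing)?\b/i,
  /!{3,}/,
];

// True when any pattern matches. The regexes have no `g` flag, so `test` is stateless.
function looksFrustrated(message: string): boolean {
  return FRUSTRATION_PATTERNS.some((pattern) => pattern.test(message));
}
```

A match would then feed whatever mode switch the product wants, such as shorter answers or a human handoff.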

Hidden Features (88 Compile-Time Flags)

Anthropic uses Bun’s dead code elimination to completely remove disabled features at build time. External users get a fundamentally different binary than internal Anthropic employees (USER_TYPE === 'ant').

Internal-Only Features

Codename	Description	Status
KAIROS	Persistent daemon mode — proactively monitors workflows and acts without user prompting	Internal only
ULTRAPLAN	30-minute multi-agent remote planning session — multiple agents collaboratively design complex plans	Internal only
Chicago	Computer Use — controls macOS desktop via MCP (mouse, keyboard, screenshots)	Internal only
Bagel	Integrated browser — full web navigation (not WebFetch, a real browser)	Internal only
Teleport	Remote session context transfer — “teleport” a session’s state to another machine	Internal only
Voice Mode	Streaming speech-to-text, microphone input	Testing
BUDDY	Tamagotchi companion system	Previewed April, launching May 2026

Flag Architecture

88 compile-time flags: Processed by Bun at build time. Disabled features are deleted from the final binary; they are physically absent rather than toggled at runtime.
700+ runtime flags: Controlled by GrowthBook. The code ships in the binary and is toggled at runtime for A/B testing and gradual rollout.
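
The difference between the two flag kinds can be illustrated as follows. In a real build, a bundler's constant-substitution option (for example a `define` setting) would replace the build flags with literals so that dead branches can be removed; here `BUILD_FLAGS` is an ordinary constant so the sketch runs as-is. The flag names and the `runtimeFlags` map are assumptions, and no GrowthBook API is used.

```typescript
// Stand-in for values a bundler would inline at build time.
const BUILD_FLAGS: Readonly<Record<"BUDDY" | "VOICE_MODE", boolean>> = {
  BUDDY: false,
  VOICE_MODE: true,
};

// Runtime flags stay in the binary and are looked up on every call.
const runtimeFlags = new Map<string, boolean>([["new-onboarding", true]]);

function enabledFeatures(): string[] {
  const features: string[] = [];
  if (BUILD_FLAGS.BUDDY) features.push("buddy"); // removable at build time
  if (BUILD_FLAGS.VOICE_MODE) features.push("voice");
  if (runtimeFlags.get("new-onboarding")) features.push("new-onboarding"); // toggled live
  return features;
}
```

Flipping a runtime flag changes behavior immediately, while changing a build flag requires shipping a different binary.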

KAIROS: Proactive Daemon Mode

Unlike standard reactive AI (waits for user input), KAIROS is proactive — it continuously observes, infers, and acts without being asked.

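import { appendFileSync, mkdirSync } from 'node:fs'
import { homedir } from 'node:os'
import { dirname, join } from 'node:path'

// Illustrative sketch: the helper types and the THRESHOLD value are placeholders
type WorkflowEvent = { type: string; context: string }
type Pattern = { confidence: number; suggestion: string }
const THRESHOLD = 0.8

class AppendOnlyLog {
  constructor(private path: string) { mkdirSync(dirname(path), { recursive: true }) }
  append(entry: object) { appendFileSync(this.path, JSON.stringify(entry) + '\n') }
}

abstract class KairosDaemon {
  // '~' is not expanded by Node, so the home directory is resolved explicitly
  private today = new Date().toISOString().slice(0, 10)
  private dailyLog = new AppendOnlyLog(join(homedir(), '.kairos', `${this.today}.log`))

  protected abstract infer(event: WorkflowEvent): Promise<string>
  protected abstract detectPattern(log: AppendOnlyLog): Promise<Pattern>
  protected abstract notifyUser(suggestion: string): void

  async observe(event: WorkflowEvent) {
    // Append-only: never modify, only add
    this.dailyLog.append({
      timestamp: Date.now(),
      type: event.type,
      context: event.context,
      inference: await this.infer(event)  // Real-time intent inference
    })
  }

  async proactiveAct() {
    const pattern = await this.detectPattern(this.dailyLog)
    if (pattern.confidence > THRESHOLD) {
      this.notifyUser(pattern.suggestion)  // Act without being asked
    }
  }
}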

The append-only log design ensures history is immutable and provides reliable input for autoDream memory consolidation.

Buddy: Tamagotchi Companion System

A deterministic virtual pet generated per user:

Species selection: Mulberry32 PRNG seeded with hash(userId + 'friend-2026-401'). 18 species with rarity tiers (Common → Legendary) + Shiny variants
Personality: Claude generates a unique “soul description” at first hatch — this becomes the Buddy’s permanent personality
System prompt: Buddy has its own independent system prompt. It’s a “watcher” that sits beside the input box and occasionally comments. When addressed by name, it responds directly (1-2 sentences max)
Deterministic: Same user always hatches the same Buddy — reproducible via PRNG, not random
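
The deterministic hatch can be sketched with the standard Mulberry32 generator. The seed string 'friend-2026-401' comes from the text above; everything else is an assumption: FNV-1a stands in for the unspecified `hash`, the four species names are placeholders for the real list of 18, and rarity tiers are omitted.

```typescript
// Standard Mulberry32: a small 32-bit PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// FNV-1a 32-bit string hash: an assumed stand-in for the source's hash().
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Placeholder species names, not the real list.
const SPECIES = ["sprout", "ember", "pebble", "gust"];

function hatch(userId: string): string {
  const rng = mulberry32(fnv1a(userId + "friend-2026-401"));
  return SPECIES[Math.floor(rng() * SPECIES.length)];
}
```

Because the generator is seeded only from the user ID and a fixed salt, calling `hatch` twice for the same user returns the same species with no stored state.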

Anti-Distillation System

Two layers preventing competitors from training on Claude Code’s outputs:

Layer 1: Fake Tool Injection

Injects plausible but non-functional tool definitions into outputs when distillation is detected. A competitor training on these outputs would learn to call tools that don’t exist.

Layer 2: Encrypted Signature Summaries

Embeds cryptographically signed metadata in generated summaries. If these signatures appear in a competitor’s model outputs, they would serve as evidence that the training data originated from Claude Code.

Undercover Mode

When Claude Code contributes to open-source repositories, it can hide its AI identity:

Strips AI-identifying patterns from commit messages and code comments
Removes internal codenames and feature flag references
Adjusts coding style to appear human-authored
Ironically, Anthropic’s own source leak happened despite having this system

Commercial Application Map

Priority	What to Build	Pattern Source
P0	Token budget per tenant	Task Budget Management
P0	Permission-gated tools	Permission Gate 3-layer
P1	Coordinator + Workers replacing linear workflows	Coordinator mode
P1	Post-conversation memory consolidation	autoDream
P1	Startup parallel prefetch (Redis/DB/vector)	Parallel Prefetch
P2	Frustrated customer detection + human handoff	Frustration Detection
P2	Tier-based feature flags (Basic/Pro/Enterprise)	Compile-time Flags

Sources

Claude Code source leak analysis (cli.js.map, npm @anthropic-ai/claude-code v2.1.88)
Engineer’s Codex Deep Dive
WaveSpeedAI Architecture Analysis
VentureBeat Coverage
Straiker Security Analysis
