Products claude model version product assistant analysis

Claude Code Architecture Explained: Agent Loop, Tool System, and Permission Model (Rust Rewrite Analysis)

DEV Communityby brooks wilsonApril 2, 20266 min read3 views

🧒Explain Like I'm 5Simple language

Hey there, little explorer! Imagine Claude is a super-smart robot friend who helps you with puzzles.

Someone accidentally showed everyone all of Claude's secret instructions, like a giant cookbook with a million pages! It was too much to read.

But then, some clever friends made a much smaller cookbook, with only a few pages, that still lets Claude do all his cool tricks! It's like they found the magic recipe that really matters.

This little cookbook shows us how Claude thinks (that's his "agent loop"), how he uses his toy tools to build things, and how he knows what he's allowed to do. It's like seeing the simple, secret heart of our robot friend! Isn't that neat?

<h2> Claude Code Deep Dive (Part 1): Architecture Overview and the Core Agent Loop </h2> <p>Claude Code’s leaked source code weighs in at over <strong>510,000 lines of TypeScript</strong>—far too large to analyze directly.</p> <p>Interestingly, a community-driven Rust rewrite reduced that complexity to around <strong>20,000 lines</strong>, while still preserving the core functionality.</p> <p>Starting from this simplified version makes one thing much clearer:</p> <blockquote> <p>What does an AI agent system <em>actually need</em> to work?</p> </blockquote> <h2> Why Start with the Rust Rewrite? </h2> <p>On March 31, 2026, Claude Code’s full source was unintentionally exposed due to an npm packaging mistake.</p> <p>The package <code>@anthropic-ai/claude-code v2.1.88</code> included a <strong

Claude Code Deep Dive (Part 1): Architecture Overview and the Core Agent Loop

Claude Code’s leaked source code weighs in at over 510,000 lines of TypeScript—far too large to analyze directly.

Interestingly, a community-driven Rust rewrite reduced that complexity to around 20,000 lines, while still preserving the core functionality.

Starting from this simplified version makes one thing much clearer:

What does an AI agent system actually need to work?

Why Start with the Rust Rewrite?

On March 31, 2026, Claude Code’s full source was unintentionally exposed due to an npm packaging mistake.

The package @anthropic-ai/claude-code v2.1.88 included a 59.8MB source map file, which allowed anyone to reconstruct the original TypeScript codebase.

To clarify:

The official GitHub repo always existed
But it only contained compiled bundles and documentation
The readable source code was not normally accessible

The Problem with the Original Codebase

Most analyses focused on the leaked TypeScript code:

510K+ lines
QueryEngine alone: ~46K lines
40+ tools
Complex plugin system

The result: too much detail, not enough clarity.

Why the Rust Version Is More Useful

Shortly after the leak:

Developer Sigrid Jin (instructkr community)
First built a Python clean-room version
Then pushed a Rust implementation (claw-code)

👉 Project overview: claw-code

This version:

~20K lines of Rust
Retains core functionality:

Agent loop Tool system Permission control Prompt system Session management MCP protocol Sub-agents

The key benefit:

Rewriting forces simplification. What remains is what actually matters.

Architecture Overview: A 6-Module System

The Rust implementation is structured into six modules:

claw-code/ ├── runtime/ # Core runtime: loop, permissions, config, session, prompt ├── api/ # LLM client, SSE streaming, OAuth ├── tools/ # Tool registry and execution ├── commands/ # Slash commands (/help, /cost) ├── compat-harness/ # TS → Rust compatibility layer └── rusty-claude-cli/ # CLI, REPL, terminal rendering

claw-code/ ├── runtime/ # Core runtime: loop, permissions, config, session, prompt ├── api/ # LLM client, SSE streaming, OAuth ├── tools/ # Tool registry and execution ├── commands/ # Slash commands (/help, /cost) ├── compat-harness/ # TS → Rust compatibility layer └── rusty-claude-cli/ # CLI, REPL, terminal rendering

Enter fullscreen mode

Exit fullscreen mode

These modules form a layered architecture:

CLI / REPL (User Interaction) ───────────────────────────── MCP Protocol · Sub-agents (Extension Layer) ───────────────────────────── API Client · Session Management (Communication Layer) ───────────────────────────── System Prompt · Config (Context Layer) ───────────────────────────── Agent Loop · Tools · Permissions (Core Layer)

CLI / REPL (User Interaction) ───────────────────────────── MCP Protocol · Sub-agents (Extension Layer) ───────────────────────────── API Client · Session Management (Communication Layer) ───────────────────────────── System Prompt · Config (Context Layer) ───────────────────────────── Agent Loop · Tools · Permissions (Core Layer)

Enter fullscreen mode

Exit fullscreen mode

A Key Design Decision

The runtime module defines interfaces, not implementations:

ApiClient → LLM communication
ToolExecutor → tool execution

Concrete implementations live at the top (CLI layer).

This enables:

Mock implementations for testing
Real implementations for production
Zero changes to core logic

Testability is built into the architecture—not added later.

The Core: An 88-Line Agent Loop

If you only read one file, read this:

conversation.rs

The entire agent loop is implemented in ~88 lines.

Runtime State: Simpler Than Expected

AgentRuntime {  session # message array (the only state)  api_client # LLM interface  tool_executor # tool execution  permission_policy # access control  system_prompt  max_iterations  usage_tracker }

AgentRuntime {  session # message array (the only state)  api_client # LLM interface  tool_executor # tool execution  permission_policy # access control  system_prompt  max_iterations  usage_tracker }

Enter fullscreen mode

Exit fullscreen mode

The surprising part:

The only state is a message array.

No explicit state machine. No workflow graph.

The Core Loop: run_turn()

Here’s the simplified logic:

python

def run_turn(user_input):
 session.messages.append(UserMessage(user_input))


`while True:
 if iterations > max_iterations:
 raise Error("Max iterations exceeded")

 response = api_client.stream(system_prompt, session.messages)

 assistant_message = parse_response(response)
 session.messages.append(assistant_message)

 tool_calls = extract_tool_uses(assistant_message)

 if not tool_calls:
 break

 for tool_name, input in tool_calls:
 permission = authorize(tool_name, input)

 if permission == Allow:
 result = tool_executor.execute(tool_name, input)
 session.messages.append(ToolResult(result))
 else:
 session.messages.append(
 ToolResult(deny_reason, is_error=True)
 )`


Enter fullscreen mode
 


 Exit fullscreen mode


`---

## A Concrete Example

User asks:

> “What is 2 + 2?”

Execution flow:

| Step | Message State | Description |
| ------ | -------------------------- | ------------------------ |
| Start | `[User("2+2")]` | User input |
| API #1 | + Assistant (calls tool) | Model decides to compute |
| Tool | + ToolResult("4") | Tool executes |
| API #2 | + Assistant("Answer is 4") | Final answer |
| End | Loop exits | No more tool calls |

Termination condition:

> The model decides to stop calling tools.

---

## Key Design Insight #1: Messages = State

Instead of managing state explicitly:

* The system stores everything as messages
* The full state is reconstructible from history

Benefits:

* Easy persistence (save session)
* Easy replay (debugging)
* Easy compression (context trimming)

> One append-only structure solves multiple problems.

---

## Key Design Insight #2: Errors Are Feedback

When a tool is denied:

* The system does **not** crash
* It returns an error as a `ToolResult`

This is fed back to the model.

Result:

* The model adapts
* Chooses alternative strategies

> Failure becomes part of the reasoning loop.

---

## Tool System: 18 Tools, One Pattern

The Rust version implements 18 built-in tools in a unified structure.

---

### Three Layers


```plaintext
1. Tool Registry → defines schema and permissions
2. Dispatcher → routes tool calls
3. Implementation → executes logic`


Enter fullscreen mode
 


 Exit fullscreen mode


### Tool Specification


```json id="i9j1sx"
{
 "name": "bash",
 "description": "Execute shell commands",
 "input_schema": {
 "command": "string",
 "timeout": "number?"
 },
 "required_permission": "DangerFullAccess"
}


`This schema is passed directly to the LLM.

---

### Why JSON Schema Matters

* Decouples LLM from implementation
* Enables language-agnostic tools
* Standardizes interfaces

> Schema = contract

---

### Dispatcher Pattern


```python id="5g5syv"
def execute_tool(name, input):
 match name:
 "bash" -> run_bash()
 "read_file" -> run_read()
 ...`


Enter fullscreen mode
 


 Exit fullscreen mode


Adding a tool:


- Define input struct

- Implement logic

- Add one dispatch line


### Sub-Agent Design


Sub-agents reuse the same runtime:


```python id="5y9zsl"
runtime = AgentRuntime(
 session = new_session,
 tool_executor = restricted_tools,
 permission = high,
 prompter = None
)


`Key constraint:

* Sub-agents cannot spawn sub-agents

This prevents recursion loops.

---

## Permission System: Minimal but Complete

The system uses **5 permission levels**:

* ReadOnly
* WorkspaceWrite
* DangerFullAccess
* Prompt
* Allow

---

### Core Logic


```python id="9t9ahj"
if current >= required:
 allow
elif one_level_gap:
 ask_user
else:
 deny`


Enter fullscreen mode
 


 Exit fullscreen mode


### Design Insight: Gradual Escalation


Instead of:


- All-or-nothing access


It uses:


> Controlled escalation


- Small gap → ask user

- Large gap → deny


### Sub-Agent Safety Model


Sub-agents:


- Have high permission

- But no user prompt interface


Result:


- Allowed within scope

- Automatically blocked outside


> Two mechanisms combine into precise control.


## Part 1 Summary


Claude Code’s core reduces to three components:


`Agent Loop → execution engine
Tool System → action layer
Permissions → safety control`


Enter fullscreen mode
 


 Exit fullscreen mode


Key principles:


- Messages are the only state

- LLM decides when to stop

- Tools are schema-driven

- Errors are part of reasoning

- Permissions are incremental


## Final Thought


After stripping away 500K lines of code, what remains is surprisingly small:


> A loop, a tool interface, and a permission system.


That’s enough to build a functional AI agent.


But making it robust, scalable, and safe—that’s where the real complexity begins.


## Next Part


Claude Code Deep Dive (Part 2): Context Engineering and Design Patterns


- Prompt construction

- Config merging

- Context compression

- Practical design takeaways


## References


- Claw Code (Rust rewrite): https://github.com/instructkr/claw-code

- Project site: https://claw-code.codes/

- Claude Code official repo: https://github.com/anthropics/claude-code

def run_turn(user_input):
 session.messages.append(UserMessage(user_input))


`while True:
 if iterations > max_iterations:
 raise Error("Max iterations exceeded")

 response = api_client.stream(system_prompt, session.messages)

 assistant_message = parse_response(response)
 session.messages.append(assistant_message)

 tool_calls = extract_tool_uses(assistant_message)

 if not tool_calls:
 break

 for tool_name, input in tool_calls:
 permission = authorize(tool_name, input)

 if permission == Allow:
 result = tool_executor.execute(tool_name, input)
 session.messages.append(ToolResult(result))
 else:
 session.messages.append(
 ToolResult(deny_reason, is_error=True)
 )`


Enter fullscreen mode
 


 Exit fullscreen mode


`---

## A Concrete Example

User asks:

> “What is 2 + 2?”

Execution flow:

| Step | Message State | Description |
| ------ | -------------------------- | ------------------------ |
| Start | `[User("2+2")]` | User input |
| API #1 | + Assistant (calls tool) | Model decides to compute |
| Tool | + ToolResult("4") | Tool executes |
| API #2 | + Assistant("Answer is 4") | Final answer |
| End | Loop exits | No more tool calls |

Termination condition:

> The model decides to stop calling tools.

---

## Key Design Insight #1: Messages = State

Instead of managing state explicitly:

* The system stores everything as messages
* The full state is reconstructible from history

Benefits:

* Easy persistence (save session)
* Easy replay (debugging)
* Easy compression (context trimming)

> One append-only structure solves multiple problems.

---

## Key Design Insight #2: Errors Are Feedback

When a tool is denied:

* The system does **not** crash
* It returns an error as a `ToolResult`

This is fed back to the model.

Result:

* The model adapts
* Chooses alternative strategies

> Failure becomes part of the reasoning loop.

---

## Tool System: 18 Tools, One Pattern

The Rust version implements 18 built-in tools in a unified structure.

---

### Three Layers


```plaintext
1. Tool Registry → defines schema and permissions
2. Dispatcher → routes tool calls
3. Implementation → executes logic`


Enter fullscreen mode
 


 Exit fullscreen mode


### Tool Specification


```json id="i9j1sx"
{
 "name": "bash",
 "description": "Execute shell commands",
 "input_schema": {
 "command": "string",
 "timeout": "number?"
 },
 "required_permission": "DangerFullAccess"
}


`This schema is passed directly to the LLM.

---

### Why JSON Schema Matters

* Decouples LLM from implementation
* Enables language-agnostic tools
* Standardizes interfaces

> Schema = contract

---

### Dispatcher Pattern


```python id="5g5syv"
def execute_tool(name, input):
 match name:
 "bash" -> run_bash()
 "read_file" -> run_read()
 ...`


Enter fullscreen mode
 


 Exit fullscreen mode


Adding a tool:


- Define input struct

- Implement logic

- Add one dispatch line


### Sub-Agent Design


Sub-agents reuse the same runtime:


```python id="5y9zsl"
runtime = AgentRuntime(
 session = new_session,
 tool_executor = restricted_tools,
 permission = high,
 prompter = None
)


`Key constraint:

* Sub-agents cannot spawn sub-agents

This prevents recursion loops.

---

## Permission System: Minimal but Complete

The system uses **5 permission levels**:

* ReadOnly
* WorkspaceWrite
* DangerFullAccess
* Prompt
* Allow

---

### Core Logic


```python id="9t9ahj"
if current >= required:
 allow
elif one_level_gap:
 ask_user
else:
 deny`


Enter fullscreen mode
 


 Exit fullscreen mode


### Design Insight: Gradual Escalation


Instead of:


- All-or-nothing access


It uses:


> Controlled escalation


- Small gap → ask user

- Large gap → deny


### Sub-Agent Safety Model


Sub-agents:


- Have high permission

- But no user prompt interface


Result:


- Allowed within scope

- Automatically blocked outside


> Two mechanisms combine into precise control.


## Part 1 Summary


Claude Code’s core reduces to three components:


`Agent Loop → execution engine
Tool System → action layer
Permissions → safety control`


Enter fullscreen mode
 


 Exit fullscreen mode


Key principles:


- Messages are the only state

- LLM decides when to stop

- Tools are schema-driven

- Errors are part of reasoning

- Permissions are incremental


## Final Thought


After stripping away 500K lines of code, what remains is surprisingly small:


> A loop, a tool interface, and a permission system.


That’s enough to build a functional AI agent.


But making it robust, scalable, and safe—that’s where the real complexity begins.


## Next Part


Claude Code Deep Dive (Part 2): Context Engineering and Design Patterns


- Prompt construction

- Config merging

- Context compression

- Practical design takeaways


## References


- Claw Code (Rust rewrite): https://github.com/instructkr/claw-code

- Project site: https://claw-code.codes/

- Claude Code official repo: https://github.com/anthropics/claude-code

Original source

DEV Community

https://dev.to/brooks_wilson_36fbefbbae4/claude-code-architecture-explained-agent-loop-tool-system-and-permission-model-rust-rewrite-41b2

Was this article helpful?

Ask AI about this article

Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

claudemodelversion

Models

Exclusive | The Sudden Fall of OpenAI’s Most Hyped Product Since ChatGPT - WSJ

Exclusive | The Sudden Fall of OpenAI’s Most Hyped Product Since ChatGPT WSJ

Google News: ChatGPT

1m6 days ago

Models

Anthropic Races to Contain Leak of Code Behind Claude AI Agent - WSJ

Anthropic Races to Contain Leak of Code Behind Claude AI Agent WSJ

Google News: Claude

1m4 days ago

ProductsRecent

Apple at 50: Three products that changed how we live - and three that really didn't

On the tech giant's 50th year, we ask analysts to give their top three Apple successes and misses

BBC Technology

1mabout 13 hours ago

Knowledge Map

TopicsEntitiesSource

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 162 connections

Scroll to zoom · drag to pan · click to open

Discussion

No comments yet — be the first to share your thoughts!

More in Products

ProductsRecent

OpenAI’s Top Executive Fidji Simo to Take Medical Leave From Company - WSJ

OpenAI’s Top Executive Fidji Simo to Take Medical Leave From Company WSJ

Google News: OpenAI

1m1 day ago

ProductsRecent

Apple at 50: Three products that changed how we live - and three that really didn't

On the tech giant's 50th year, we ask analysts to give their top three Apple successes and misses

BBC Technology

1mabout 13 hours ago

ProductsLive

Cortex Code in Snowflake: How to Use It Without Burning Credits

Snowflake Cortex Code (CoCo) is like an AI assistant inside Snowsight (and CLI also). You can ask it to write SQL, create dbt models, explore data, help in ML work, and even do some admin tasks. But one thing people don’t realise early — this tool is powerful, but also costly if used wrongly. Bad prompts → more tokens → more credits → surprise bill. Prompt Engineering (this directly impacts cost) CoCo works on token consumption. what you type → counted 2. what it replies → counted If your prompt is vague → more tool calls → more cost. Example: Bad: Help me with my data Good: Create staging model for RAW.SALES.ORDERS with not_null on ORDER_ID Best Practices: Use full table names 2. Be clear about output 3. Keep prompts small 4. Provide business logic upfront 5. Use AGENTS.md for consistency

Towards AI

3mabout 1 hour ago

ProductsLive

The Stack Nobody Recommended

The most common question I got after publishing Part 1 was some variation of "why did you pick X instead of Y?" So this post is about that. Every major technology choice, what I actually considered, where I was right, and where I got lucky. I'll be upfront: some of these were informed decisions. Some were "I already know this tool, and I need to move fast." Both are valid, but they lead to different trade-offs down the line. The Backend: FastAPI I come from JavaScript and TypeScript. Years of React on the frontend, Express and Fastify on the backend. When I decided this project would be Python, because that's where the AI/ML ecosystem lives, I needed something that didn't feel foreign. FastAPI clicked immediately. The async/await model, the decorator-based routing, and type hints that actu

DEV Community

9mabout 1 hour ago