Claude Code memory: how to survive a 200k context window filling up
If you've used Claude Code for more than a few hours on a big project, you've hit this wall.
You're in the middle of a refactor. Claude is tracking 15 files, your CLAUDE.md, the conversation history, tool call outputs. Then it slows down. Responses get shorter. It starts forgetting things you told it an hour ago.
You're not imagining it. Claude Code's context window is filling up — and there's a specific way to handle it.
What's actually consuming your context
Claude Code tracks several layers of context simultaneously:
```
┌─────────────────────────────────────┐
│ System prompt (CLAUDE.md)     ~2k   │
│ Project context (settings)    ~1k   │
│ Conversation history          fills │
│ Tool call results             large │
│ File contents (read_file)     large │
│ Available: 200k total               │
└─────────────────────────────────────┘
```
The biggest culprits:
- Tool call results — every `read_file`, `bash`, and `grep` appends its full output
- Long conversation threads — each message adds up
- Repeated file reads — Claude re-reads the same files multiple times
The early warning signs
Before the window fills completely, you'll notice:
- Responses get shorter and less specific
- Claude starts asking you to re-explain things you already covered
- `read_file` outputs get truncated
- Multi-step plans lose track of earlier steps
- Claude starts hedging more: "I'm not sure if we already..."
Strategy 1: The /clear command
The nuclear option. Wipes the conversation history entirely.
```
/clear
```
But you lose all context — including the reasoning behind decisions. Use this as a last resort.
Strategy 2: Checkpoint summaries
Before hitting the limit, ask Claude to summarize its own progress:
```
Before we continue, write a markdown summary of:
- What we've accomplished so far
- What files we've modified and why
- What the next 3 steps are
- Any decisions or constraints I should know about

Save it to PROGRESS.md
```
Now you can /clear and paste the summary back in. You lose the conversation but keep the knowledge.
Strategy 3: Compact the context mid-session
Instead of clearing everything, compress it:
```
Summarize everything we've done in this session into a single paragraph I can
paste into a fresh conversation to continue. Be extremely concise — just the
decisions, changes made, and current state.
```
This gives you a handoff document that fits in ~500 tokens instead of 50k.
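You can sanity-check whether a handoff paragraph actually fits the budget with a character count. A rough sketch: the ~4-characters-per-token ratio is a common rule of thumb, not Claude's actual tokenizer, and the 500-token budget is this article's target, not a hard limit.

```shell
# Rough check that a handoff summary fits a ~500-token budget.
# Heuristic: ~4 characters per token (an approximation, not the real tokenizer).
handoff_fits() {
  local tokens
  tokens=$(( $(wc -c < "$1") / 4 ))
  if [ "$tokens" -le 500 ]; then
    echo "ok: ~$tokens tokens"
  else
    echo "too big: ~$tokens tokens, tighten the summary"
  fi
}
```

If the summary comes back "too big", ask Claude to cut it again before you `/clear`.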
Strategy 4: Prevent it with targeted reads
Instead of:
```
# Bad: loads entire files into context
read the codebase and understand how auth works
```
Do:
```shell
# Good: surgical reads
grep -n 'auth\|login\|token' src/routes/*.js | head -30
```
Prompt Claude to use grep/find before read_file. It gets the answer with 1/10th the context cost.
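You can measure the difference on your own files before pointing Claude at them. A hypothetical sketch: `context_cost` is an invented helper name, the file path and pattern are placeholders, and the ~4 chars/token ratio is a rough heuristic.

```shell
# Compare the approximate context cost of a full file read vs a targeted grep.
# $1 = file, $2 = grep pattern. The ~4 chars/token ratio is only a heuristic.
context_cost() {
  local full targeted
  full=$(( $(wc -c < "$1") / 4 ))
  targeted=$(( $(grep -n "$2" "$1" | head -30 | wc -c) / 4 ))
  echo "full read: ~$full tokens, targeted grep: ~$targeted tokens"
}
```

If the grep output already answers the question, skip the full read entirely.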
Strategy 5: Use subagents for isolated tasks
For work that doesn't need full project context, spin up a subagent:
```
Create a subagent that ONLY has access to src/utils/format.js. Its only job:
add JSDoc comments to every function. Report back the changes when done.
```
The subagent runs in its own context window. The results come back to your main session as a compact summary, not a 10k token diff.
Strategy 6: CLAUDE.md context pruning
Your CLAUDE.md gets loaded every session. Keep it under 500 lines.
Periodically audit:
```
Review my CLAUDE.md and identify:
- Instructions that are outdated
- Instructions that are redundant
- Instructions that could be shorter

Suggest a pruned version under 200 lines.
```
Strategy 7: The session handoff pattern
For long projects, end each session with a handoff ritual:
```
# In CLAUDE.md, add this section:

## Session Handoff Protocol

At the end of each session, create SESSION-NOTES.md with:
- What was accomplished
- What was NOT done (and why)
- Current blockers
- Next session starting point
- Any important context future sessions need
```
This externalizes memory to the filesystem, where it's free.
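A small helper makes the ritual one command. A sketch: `new_session_notes` is a hypothetical name, and the headings simply mirror the protocol above.

```shell
# Scaffold SESSION-NOTES.md with the handoff sections from the protocol above.
# new_session_notes is a hypothetical helper; adjust the headings to taste.
new_session_notes() {
  {
    echo "# Session notes, $(date +%F)"
    echo
    echo "## Accomplished"
    echo
    echo "## Not done (and why)"
    echo
    echo "## Blockers"
    echo
    echo "## Next session starting point"
    echo
    echo "## Context future sessions need"
  } > "${1:-SESSION-NOTES.md}"
}
```

Run it at the end of each session, fill in the blanks (or have Claude fill them), and the next session starts from the file instead of from memory.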
What about rate limits?
Here's the thing that's frustrating with standard Claude access: hitting context limits often coincides with hitting rate limits, because you're doing your most intensive work.
If you're using Claude Code via ANTHROPIC_BASE_URL pointed at a flat-rate proxy:
```shell
export ANTHROPIC_BASE_URL=https://api.simplylouie.com
```
You get unlimited requests at $2/month — so at least rate limits stop compounding the context window problem.
TL;DR: the memory survival kit
1. Watch for early warning signs (shorter responses, hedging)
2. Use checkpoint summaries before `/clear`
3. Surgical grep instead of full file reads
4. Subagents for isolated tasks
5. Keep CLAUDE.md under 200 lines
6. End every long session with SESSION-NOTES.md
The 200k context window feels huge until you're doing real work. These patterns keep you productive across the full project lifecycle.
Claude Code power user? Try pointing ANTHROPIC_BASE_URL at simplylouie.com for flat-rate access — no rate limits, $2/month.
Dev.to AI
https://dev.to/subprime2010/claude-code-memory-how-to-survive-a-200k-context-window-filling-up-idk
