Agentic Engineering Journey — Brain Dump

DEV Community | by dimitri | April 3, 2026 | 4 min read | 2 views

1. Where It Started: Memory and Context

I started with Claude Code around April 2025. The first real step was recognising that Claude's native memory was essentially useless. The workaround was using markdown files as persistent memory stores, editable both through Claude and tools like Cursor. That opened the door to storing not just session notes but also instructions, roles, and agent skills — anything that would otherwise be forgotten across context resets.
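To make the markdown-as-memory idea concrete, here is a minimal sketch of a helper that appends dated notes under section headings in a single file. The file name MEMORY.md, the "## section" layout, and the function names are assumptions for illustration, not the layout the framework actually uses.

```python
from datetime import date
from pathlib import Path


def append_note(memory_file: Path, section: str, note: str) -> None:
    """Append a dated bullet under a '## <section>' heading, creating it if absent."""
    text = memory_file.read_text(encoding="utf-8") if memory_file.exists() else ""
    lines = text.splitlines()
    heading = f"## {section}"
    bullet = f"- {date.today().isoformat()}: {note}"

    if heading in lines:
        # Find where this section ends (next '## ' heading or end of file).
        start = lines.index(heading)
        end = start + 1
        while end < len(lines) and not lines[end].startswith("## "):
            end += 1
        # Step back over trailing blank lines so the bullet joins the list.
        while end > start + 1 and lines[end - 1] == "":
            end -= 1
        lines.insert(end, bullet)
    else:
        if lines and lines[-1] != "":
            lines.append("")
        lines += [heading, "", bullet]

    memory_file.write_text("\n".join(lines) + "\n", encoding="utf-8")


def load_memory(memory_file: Path) -> str:
    """Return the whole memory file, e.g. to prepend to a fresh session's prompt."""
    return memory_file.read_text(encoding="utf-8") if memory_file.exists() else ""
```

Because the store is plain markdown, the same file stays editable by hand or from any editor, which is the property the workaround relies on.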

But the fundamental problem remained: at some point the context window fills, the model gets amnesia, and starts behaving destructively. Cursor handled this somewhat better at the time. Gemini had an edge due to its larger context window (already at 1M tokens), though at a cost. Neither was a real solution.

2. The Core Principle

Taking a step back from tooling led to the central insight the whole framework is built on:

The better the prompt, the better the output. The better the instruction — and the context around it — the higher the likelihood of a good result.

This is no different from how you'd brief a human. Context, clarity, traceability, constraints — all of it shapes the quality of what gets executed. The question became: how do you systematically generate and maintain that context?

3. The Agentic Engineering Framework

To produce good, consistent context, you need to capture:

What has been done before — every instruction, tool call, output, error, pivot, and decision
What the goals and architecture look like — what was decided and why
What is connected to what — if you change function X, what does that break elsewhere?

This last point introduced the concept of the blast radius — borrowed from physical and industrial engineering. It describes the potential impact zone of any given change.

Context Fabric

Captures the full history of work: what was done, what failed, what changed, what was decided. When an agent starts a new task, it can look back at relevant prior context rather than starting blind.
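As a rough illustration of that lookup, the sketch below records events and retrieves the ones touching a given set of components. The Event fields, the ContextLog class, and the task ID strings are hypothetical; the real Context Fabric may store and index history quite differently.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Event:
    task_id: str   # the task this event belongs to
    kind: str      # e.g. "instruction", "tool_call", "error", "decision"
    summary: str   # short human-readable description
    touches: list[str] = field(default_factory=list)  # components involved


class ContextLog:
    """In-memory history of work; a real store would persist to disk."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def record(self, event: Event) -> None:
        self._events.append(event)

    def relevant(self, components: set[str]) -> list[Event]:
        """Prior events touching any of the given components, oldest first."""
        return [e for e in self._events if components & set(e.touches)]

    def to_jsonl(self) -> str:
        """Serialise the history as one JSON object per line."""
        return "\n".join(json.dumps(asdict(e)) for e in self._events)
```

An agent starting work on a component could call relevant() first and feed the returned summaries into its prompt instead of starting blind.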

Component Fabric

Provides structural awareness — understanding how components relate to each other so that an action's blast radius can be assessed before it's taken.
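One way to compute a blast radius, assuming the component relationships are available as a simple "X depends on Y" map, is to walk the dependency graph in reverse from the changed component. The function name and the example component names below are illustrative only.

```python
from collections import deque


def blast_radius(depends_on: dict[str, set[str]], changed: str) -> set[str]:
    """Return every component that directly or transitively depends on `changed`."""
    # Invert the edges: for each component, who uses it?
    dependents: dict[str, set[str]] = {}
    for node, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(node)

    affected: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for user in dependents.get(current, ()):
            # Skip already-visited nodes so cycles terminate.
            if user != changed and user not in affected:
                affected.add(user)
                queue.append(user)
    return affected


graph = {
    "api": {"auth", "db"},
    "auth": {"db"},
    "cli": {"api"},
    "docs": set(),
}
print(sorted(blast_radius(graph, "db")))  # ['api', 'auth', 'cli']
```

The size of the returned set gives a crude risk signal: a change to "db" here reaches three components, while a change to "docs" reaches none.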

The Prime Directive

Nothing gets done without a task. Every action must be linked to a task. This enforces traceability and prevents autonomous drift.

Enforcement is the hard part. Git hooks work sometimes. Claude Code doesn't reliably respect the constraint — partly due to the stochastic nature of LLMs, partly due to permissive execution environments. If broad tool permissions are granted, there's nothing structurally preventing the model from bypassing the rule.
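For the git-hook side of enforcement, a commit-msg hook can reject commits whose message carries no task reference. The sketch below assumes a hypothetical task ID format of "T-" followed by digits; the framework's real ID scheme is not given here. Git passes the path of the commit message file as the hook's first argument.

```python
import re
import sys

# Hypothetical task ID format, e.g. "T-42". Adjust to the real scheme.
TASK_ID = re.compile(r"\bT-\d+\b")


def has_task_reference(message: str) -> bool:
    """True if any non-comment line of the commit message names a task."""
    body = "\n".join(
        line for line in message.splitlines() if not line.startswith("#")
    )
    return TASK_ID.search(body) is not None


def main(argv: list[str]) -> int:
    """Entry point for a commit-msg hook; returns the process exit code."""
    if len(argv) < 2:
        print("usage: commit-msg <path-to-message-file>", file=sys.stderr)
        return 2
    with open(argv[1], encoding="utf-8") as fh:
        message = fh.read()
    if has_task_reference(message):
        return 0
    print("commit rejected: no task ID (expected e.g. T-42)", file=sys.stderr)
    return 1

# In an actual hook script, finish with: sys.exit(main(sys.argv))
```

This only guards the commit path. It can be skipped with git commit --no-verify, and it does nothing about actions that never reach a commit, which is consistent with hooks working only some of the time.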

4. TermLink

The challenge: how to coordinate multiple agents in a reliable, deterministic way.

The idea behind TermLink is that if terminal sessions are initialised in a known state, you can inject into them — essentially simulating a USB keyboard over the terminal link, sending ASCII sequences directly to the session.
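The injection idea can be sketched with a pseudo-terminal: the controlling side writes bytes to the PTY master, and the program on the other end receives them as if they had been typed. This is a Unix-only illustration using Python's standard pty module, not TermLink's actual implementation; spawn_session, inject, and read_available are hypothetical names.

```python
import os
import pty
import select
import subprocess


def spawn_session(argv: list[str]) -> tuple[subprocess.Popen, int]:
    """Start a program attached to a fresh PTY; return it with the master fd."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(
        argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd, close_fds=True
    )
    os.close(slave_fd)  # the child keeps its own copy
    return proc, master_fd


def inject(master_fd: int, text: str) -> None:
    """Send ASCII text to the session as if typed on its keyboard."""
    os.write(master_fd, text.encode("ascii"))


def read_available(master_fd: int, timeout: float = 3.0) -> bytes:
    """Collect output until the session goes quiet or closes."""
    chunks: list[bytes] = []
    while select.select([master_fd], [], [], timeout)[0]:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            break  # on Linux, reading after the child exits raises EIO
        if not data:
            break
        chunks.append(data)
    return b"".join(chunks)
```

Reading the master side back is what provides the feedback half of the loop: the controller sees both the terminal's echo of what it injected and whatever the program prints in response.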

In practice this works well. The weak point is that Claude Code sometimes falls back to calling claude -p (its non-interactive print mode) through a PTY rather than opening a terminal and running an interactive session. That loses the interactive feedback loop, the back-and-forth that makes the coordination meaningful.

TermLink is also now using a network socket interface. This opens up communication across machines — and with it, the possibility of real orchestration: routing tasks to different agents, mixing providers, and matching the right model to the right type of work reliably.

5. Proof Is in the Pudding

The proof is in the eating of the pudding. I'm using the framework to build real things I can use, and that's where you find out if it actually works.

OpenClaw ingestion: Took the OpenClaw codebase, ingested it through the Context Fabric, and exposed it for browsing and querying. Used it to extract improvement ideas for the Agentic Engineering Framework itself. The model identified enhancements, formatted them against the standard task structure, and dispatched them to the TermLink agent, which pulled from the knowledge repository and started working autonomously. It worked.

Email archiver: Started as a utility to consolidate ~70K emails across Hotmail and Google domain accounts into a single searchable archive (useful for things like digging up tax receipts). Evolved into a fuller email client with AI capabilities — translation, generation, support for both local and remote models. Still in progress. A rough first release has been pushed to GitHub. The focus has shifted toward a more controlled, personal-assistant-style interface rather than trying to match full open-source alternatives.

If you want to test-drive the OpenClaw Fabric Explorer or the AI Email Personal Assistant, drop a comment: EXPLORER. You can find the other repos here: The Engineering Framework, TermLink

Original source

DEV Community

https://dev.to/irrindar/agentic-engineering-journey-brain-dump-41gh
