What Nobody Tells You About Building a Protocol for AI Agents
For the past few months, I've been building ARSIA Protocol as a part-time open source project: an open compliance layer for AI agents, designed to sit above MCP (Anthropic) and A2A (Google).
The first ideas came in July 2025. Back then it was just a question: what if there was a compliance layer that sat above agent communication protocols? For months, that's all it was, an idea slowly taking shape. I'd study the regulatory landscape, read the EU AI Act, sketch mental models, scribble notes. Weekend conversations about what the architecture could look like. No code, no repo, no rush. The idea needed time to mature, and I let it.
By November I had rough concept drafts. By January the mental model was solid enough to start testing assumptions on paper. But it wasn't until March 2026, with the architecture firmly settled in my head, that I sat down and wrote the first real draft of the specification. Those seven months of thinking before building turned out to be one of the best decisions I made.
Two people. Six specs. Two SDKs. A CLI. A server. 900+ tests.
Here's what was actually hard.
1. You'll rewrite the architecture at least twice
We started with the obvious model: the agent implements the protocol. Discovery endpoints, EdDSA signing, compliance fields, audit trail, all inside the agent code.
It took us weeks to realize this was backwards. A developer with a working LangGraph or CrewAI agent doesn't want to rewrite it. Nobody does. MCP didn't succeed because of its protocol design; it succeeded because it hides the protocol from the developer.
So we pivoted. Hard. We built a sidecar proxy (ARSIA Client) and a gateway (ARSIA Server). The agent never touches the protocol. The developer changes one environment variable. The organization deploys a server. Done.
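To make the "one environment variable" claim concrete, here is a minimal sketch of the redirection idea. The variable name ARSIA_PROXY_URL and the sidecar port are my illustrative assumptions, not names from the spec:

```python
import os

def resolve_base_url(default_upstream: str) -> str:
    """Return the base URL the agent should send traffic to.

    Hypothetical convention: when ARSIA_PROXY_URL is set, all agent
    traffic is redirected through the local sidecar, which signs,
    audits, and forwards each request. The agent code never changes.
    """
    return os.environ.get("ARSIA_PROXY_URL", default_upstream)

# Before the sidecar is deployed, the agent talks to its upstream directly.
assert resolve_base_url("https://api.example.com") == "https://api.example.com"

# Flipping one environment variable routes everything through the proxy.
os.environ["ARSIA_PROXY_URL"] = "http://localhost:8402"
assert resolve_base_url("https://api.example.com") == "http://localhost:8402"
```

The point of the pattern is that compliance becomes a deployment decision rather than a code change.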
That pivot invalidated dozens of design decisions, hundreds of tests, and an entire CLI workflow. We rebuilt it anyway.
2. Specs are never "done"
We started with 8 specification documents. Then we realized two of them (Compliance and Onboarding) didn't deserve to be standalone; their content belonged inside the other five. So we merged them. That meant rewriting every cross-reference across six documents, verifying 34 files, and hunting for stale links that pointed to specs that no longer existed.
A protocol spec isn't code. You can't run a linter on normative language. When section 4.3.6 of Core references section 7 of State, and you renumber State's sections during a merge, nothing breaks, until an implementer reads it six months later and builds the wrong thing.
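You can, however, build a crude linter for cross-references yourself. This is a sketch under assumptions of my own: it invents a "Section X.Y of SpecName" reference convention and checks each reference against the sections each spec actually defines.

```python
import re

def find_stale_refs(doc_text: str, known_sections: dict) -> list:
    """Flag cross-references to sections that no longer exist.

    Hypothetical reference convention: 'Section X.Y of <SpecName>'.
    known_sections maps a spec name to the set of section numbers
    that spec currently defines.
    """
    stale = []
    for num, spec in re.findall(r"Section (\d+(?:\.\d+)*) of (\w+)", doc_text):
        if num not in known_sections.get(spec, set()):
            stale.append(f"Section {num} of {spec}")
    return stale

# After renumbering State's sections during a merge, an old reference rots:
sections = {"State": {"1", "2", "3"}, "Core": {"1", "2"}}
core_text = "Envelope lifecycle is defined in Section 7 of State."
print(find_stale_refs(core_text, sections))  # ['Section 7 of State']
```

A check like this won't validate normative language, but it catches the renumbering class of bug before an implementer does.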
We found 8 real gaps in our own spec after publishing Draft-01. Token refresh unspecified. Audit trail queries with no cross-agent authentication. GDPR data portability with no defined export format. WebSocket messages with no audit mapping. Each one small enough to ship without, dangerous enough to bite someone in production.
3. Naming will haunt you
We named our Python import namespace arsia. Simple, clean. Then we discovered another company had "arsia" registered at the EU trademark office. Class 9. Software.
So we migrated everything. arsia.* became arsiaprotocol.*. The CLI went from arsia to arsiactl. Agent IDs, payload types, capability strings, PyPI package names, npm scopes, all of it. Over 100 files touched across three repositories. One migration, zero tolerance for leftover references.
The lesson: pick your canonical names on day one, search every trademark registry, and never use a short unqualified namespace.
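"Zero tolerance for leftover references" is only achievable if the check is mechanical. A sketch of the kind of sweep we mean, assuming the old bare namespace was `arsia` and the new one `arsiaprotocol`:

```python
import re

# Match the bare old namespace, but not the new qualified one:
# the word boundary after 'arsia' fails inside 'arsiaprotocol'.
OLD = re.compile(r"\barsia\b")

def leftover_refs(lines: list) -> list:
    """Return (line_number, line) pairs still using the old namespace.

    Illustrative only: a real migration sweep would walk every file
    in every repository, not a list of strings.
    """
    return [(i, line) for i, line in enumerate(lines, 1) if OLD.search(line)]

migrated = [
    "import arsiaprotocol.envelope",   # clean: new namespace
    "from arsia.core import sign",     # leftover: old namespace
]
print(leftover_refs(migrated))  # [(2, 'from arsia.core import sign')]
```

Run something like this in CI after the rename and the migration stays done.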
4. Part-time open source, full-time complexity
There is no QA team. There is no DevRel team. There is no product manager. There are two people building this in their free time: writing specs, building SDKs in Python and TypeScript, creating Docker demos, writing a CLI with four namespaces, building a server with a six-guard enforcement pipeline, and preparing a public comment for NIST.
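For readers wondering what a "guard enforcement pipeline" looks like in miniature: the shape is a chain of checks where the first rejection short-circuits. The two guards below are toy stand-ins of my own invention, not the server's actual six:

```python
from typing import Callable, Optional

# A guard inspects a request and returns a rejection reason, or None to pass.
Guard = Callable[[dict], Optional[str]]

def run_guards(request: dict, guards: list) -> Optional[str]:
    """Hypothetical enforcement pipeline: guards run in order and the
    first rejection stops the chain. None means the request is allowed."""
    for guard in guards:
        reason = guard(request)
        if reason:
            return reason
    return None

def require_agent_id(r: dict):
    return None if r.get("agent_id") else "missing agent_id"

def block_unsigned(r: dict):
    return None if r.get("signature") else "unsigned envelope"

print(run_guards({"agent_id": "a1"}, [require_agent_id, block_unsigned]))
# prints "unsigned envelope"
```

The appeal of the pattern is that each compliance rule stays a small, independently testable function.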
My routine looks like this: I have two young kids. I put them to bed, and then around 10 PM I sit down and work until 2 or 3 AM at least a few days a week. On weekends I sometimes go through the entire night. Not because I have to. Because I get so absorbed that I lose track of time. The protocol pulls you in. There's always one more cross-reference to verify, one more edge case to handle, one more test to write.
During the day, I'm fully dedicated to my job. But I leave AI agents running in the background executing tasks, running test suites, checking automations. At lunch I glance at the results. Then I don't look again until the night session. The automated parts (tests, conformance checks, CI) run fine on their own. But the hard parts can't be delegated: architecture decisions, spec writing, prototyping new primitives. Those require deep, uninterrupted thought, and that's what the late nights are for.
5. How AI became my force multiplier
I have to be honest about this: building a protocol of this scope as a part-time project would have been nearly impossible without AI, specifically Claude. It fundamentally changed my prototyping speed.
When I have an architecture idea at midnight, I can prototype it in conversation, exploring trade-offs, stress-testing edge cases, generating test scaffolds, drafting spec language in a fraction of the time it would take solo. Claude doesn't replace the thinking. The seven months of incubation, the mental modeling, the architectural decisions are mine. But once I know what I want to build, AI accelerates the how dramatically. It's the difference between spending a weekend on a proof of concept and spending an hour.
The spec reorganization is a good example. Merging two specs into the existing five, updating every cross-reference across six documents, verifying consistency, that's tedious, error-prone work that can take days. With AI assistance, it took hours. The creative decisions were still mine. The execution was radically faster.
For solo builders and small teams working on ambitious open source projects in their free time, AI isn't a luxury. It's what makes the project viable at all.
6. Nobody knows they need you yet
This is the hardest one. We built a protocol that solves EU AI Act compliance, GDPR audit trails, MiFID II retention rules, and human oversight signaling at the protocol level, not the application level. It's the kind of thing that European enterprises will be legally required to solve in the next 12 to 18 months.
But today, most AI agent developers don't know they have a compliance problem. They're building cool things with tool-calling and multi-agent workflows. Regulatory enforcement feels distant. Until it isn't.
So you're building infrastructure for a problem that's real but not yet felt. You're competing against "just ship it" culture with a product that says "ship it, but with an audit trail." That's a tough sell, until the first fine lands.
What kept us going
Honestly? Those 2 AM moments when arsiactl scaffold agent spins up a working AI agent with a local LLM, makes a real GitHub API call through the compliance layer, and produces a signed, auditable envelope, all without the developer writing a single line of protocol code. That moment when the protocol becomes invisible is when you know the architecture is right. And it makes the next late-night session feel worth every lost hour of sleep.
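To show what a signed, auditable envelope means in principle, here is a self-contained sketch. The envelope fields are hypothetical, and HMAC-SHA256 stands in for the protocol's actual EdDSA (Ed25519) signatures so the example runs on the standard library alone:

```python
import hashlib
import hmac
import json

def sign_envelope(payload: dict, key: bytes) -> dict:
    """Wrap an agent message in a signed envelope.

    Illustrative only: field names are hypothetical, and HMAC-SHA256
    is a stdlib stand-in for the protocol's Ed25519 signatures.
    """
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return {
        "payload": payload,
        "alg": "HMAC-SHA256 (stand-in for Ed25519)",
        "signature": hmac.new(key, body.encode(), hashlib.sha256).hexdigest(),
    }

def verify_envelope(env: dict, key: bytes) -> bool:
    """Recompute the signature over a canonical payload serialization."""
    body = json.dumps(env["payload"], sort_keys=True, separators=(",", ":"))
    expected = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(env["signature"], expected)

env = sign_envelope({"tool": "github.create_issue", "agent": "demo"}, b"secret")
print(verify_envelope(env, b"secret"))  # True
```

The agent never writes code like this itself; the sidecar produces the envelope on its behalf, which is the whole point.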
ARSIA Protocol is open source (Apache 2.0). The specification (Draft-01) is available now. The SDKs for Python and TypeScript will be released in the coming days, and the ARSIA Client (sidecar proxy) and ARSIA Server (organizational gateway), the pieces that make compliance invisible to the developer, are coming soon after. If you're building AI agents for regulated markets, we'd love your feedback.
I hope you like it: arsiaprotocol.org | GitHub
Kirk Ferreira, Creator of ARSIA Protocol
DEV Community
https://dev.to/kirk42/what-nobody-tells-you-about-building-a-protocol-for-ai-agents-hlj
