AgentManifest: A Declarative Spec Where the Harness Is the First-Class Decision

Dev.to AI · by MouseRider · April 3, 2026 · 9 min read

RFC v0.3 — design proposal, not a shipping product. CC0 licensed. Feedback and critique welcome. GitHub: MouseRider/agentmanifest-rfc

When you run AI agents across more than one role, the execution environment turns out to matter more than it first appears. The model gets most of the attention — benchmarks, leaderboards, capability comparisons — but the harness shapes runtime behavior in ways that model selection alone doesn’t account for.

A personal assistant, an ops monitor, a coding agent, a trading bot: these aren’t the same agent with different prompts. They need different memory models, different autonomy levels, different guardrail enforcement, different lifecycle behaviors. Current agent harnesses are mostly either finished platforms you adopt wholesale, or open-ended toolkits that reward deep specialisation. There’s no standardised, composable layer in between: a way to declare what an agent needs, select the right harness for its role, and assemble the configuration portably.

AgentManifest is a design proposal for that missing layer.

This is part of an ongoing series on building persistent AI agents. Article 1 covered TSVC — context isolation across topics. Article 2 covered agent epistemology — how an agent knows what it knows. AgentManifest grew out of the same body of work: a production personal assistant running on OpenClaw, and the questions that surface when you push a system like that into real daily use.

The Spec

The spec uses a Dockerfile-like syntax with one directive per line. FROM selects the harness, which is the primary design decision in any manifest.

# Personal Assistant
FROM openclaw:latest

MODEL claude-opus
ROLE personal-assistant

TOOLS browser, email, calendar, file-system, sub-agents
MEMORY persistent, cross-session
PERSONALITY ./soul.md

GUARDRAILS approval-for-external-sends, budget-cap-daily=5.00
AUTONOMY high
HEARTBEAT interval=30m, quiet-hours=23:00-08:00

CHANNELS telegram=in-out, email=in-out, twitter=out
SPENDING daily-cap=50.00, per-transaction-cap=20.00
IDENTITY did:web:agents.example.com:assistant

DEPLOY always-on
RESTART on-failure

# Ops Monitor
# Same harness. Completely different agent.
FROM openclaw:latest

MODEL claude-haiku
ROLE ops-monitor

TOOLS file-system, ssh, docker, http, alerting
MEMORY session-only

GUARDRAILS strict-instructions, no-generative-output, read-only-by-default
AUTONOMY medium
HEARTBEAT interval=5m

ALERT_CHANNEL telegram-ops-thread
ON_ERROR alert-and-retry, max-retries=3

DEPLOY always-on
RESOURCES memory=256m

Same base harness. Completely different agent. The spec makes the differences explicit, auditable, and portable — without requiring both to fit a single one-size-fits-all runtime.
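
To make "explicit and auditable" concrete, here is a minimal Python sketch that reads two manifests and lists the directives that differ. It assumes the simplest possible grammar (one `DIRECTIVE value` pair per line, `#` for comments); the RFC has no formal grammar yet, so `parse_manifest` and `diff_manifests` are hypothetical helpers, not part of the spec.

```python
def parse_manifest(text: str) -> dict[str, str]:
    """Parse 'DIRECTIVE value' lines into a dict, skipping blanks and # comments."""
    directives: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # directive name, then the rest of the line
        directives[parts[0].upper()] = parts[1].strip() if len(parts) > 1 else ""
    return directives


def diff_manifests(a: dict[str, str], b: dict[str, str]) -> dict[str, tuple]:
    """Return {directive: (value_in_a, value_in_b)} for each directive that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Shortened versions of the two example manifests (assumed layout).
ASSISTANT = """
# Personal Assistant
FROM openclaw:latest
MODEL claude-opus
MEMORY persistent, cross-session
AUTONOMY high
"""

OPS_MONITOR = """
# Ops Monitor
FROM openclaw:latest
MODEL claude-haiku
MEMORY session-only
AUTONOMY medium
"""

assistant = parse_manifest(ASSISTANT)
monitor = parse_manifest(OPS_MONITOR)
changes = diff_manifests(assistant, monitor)
for name, (left, right) in changes.items():
    print(f"{name}: {left!r} -> {right!r}")
```

FROM is identical in both files, so it never shows up in the diff; only the directives that actually make the two agents different do.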

Swap the harness and the same directives target a different execution environment:

FROM langgraph:latest
# or
FROM claude-code:latest
# or
FROM crewai:latest
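
One way to picture the swap is a resolver registry keyed by harness name. The Python sketch below is an assumption about how a harness resolver might be organised; the configuration keys (`memory_mode`, `checkpointing`) are invented for illustration and do not mirror any real OpenClaw or LangGraph setting.

```python
from typing import Callable

# Hypothetical registry: harness name -> function that turns manifest
# directives into that harness's native configuration.
RESOLVERS: dict[str, Callable[[dict[str, str]], dict]] = {}


def resolver(name: str):
    """Decorator that registers a resolver under a harness name."""
    def register(fn):
        RESOLVERS[name] = fn
        return fn
    return register


@resolver("openclaw")
def openclaw_config(directives: dict[str, str]) -> dict:
    return {"runtime": "openclaw",
            "memory_mode": directives.get("MEMORY", "session-only")}


@resolver("langgraph")
def langgraph_config(directives: dict[str, str]) -> dict:
    return {"runtime": "langgraph",
            "checkpointing": directives.get("MEMORY") != "session-only"}


def resolve(directives: dict[str, str]) -> dict:
    """Pick the resolver named by FROM and let it translate the directives."""
    harness, _, tag = directives["FROM"].partition(":")
    if harness not in RESOLVERS:
        raise LookupError(f"no resolver registered for harness {harness!r}")
    config = RESOLVERS[harness](directives)
    config["tag"] = tag or "latest"
    return config


print(resolve({"FROM": "openclaw:latest", "MEMORY": "persistent, cross-session"}))
print(resolve({"FROM": "langgraph:latest", "MEMORY": "persistent, cross-session"}))
```

The same MEMORY directive ends up as a different native setting in each harness, and a FROM line with no registered resolver fails loudly instead of being guessed at.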

Why Harness Selection Belongs in the Spec

Model selection is reasonably well-served by existing tooling — benchmarks, leaderboards, capability comparisons are all mature. Harness selection is less well-served, and it has more influence over runtime behavior than the current tooling reflects.

Here’s a concrete distinction worth making explicit. Writing “always ask for approval before deleting files” in a system prompt is a soft constraint — the model follows it as part of its instruction-following behavior. A deterministic guardrail at the harness level enforces the same rule unconditionally, independent of context length or task complexity. Both are valid approaches; they’re not equivalent, and the choice between them is a meaningful design decision that currently lives in implementation rather than in the agent definition.
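
The hard-constraint side of that distinction can be sketched in a few lines of Python. `GuardedTools` and `ApprovalRequired` are hypothetical names, and how approval is actually obtained (a human click, a policy service) is left out; the point is only that the check runs in ordinary code before the tool does.

```python
class ApprovalRequired(Exception):
    """Raised when a guarded tool is called without explicit approval."""


class GuardedTools:
    def __init__(self, tools, needs_approval):
        self._tools = dict(tools)
        self._needs_approval = set(needs_approval)

    def call(self, name, *args, approved=False, **kwargs):
        # This check does not depend on the model remembering an instruction:
        # it runs on every call, whatever the prompt or context length.
        if name in self._needs_approval and not approved:
            raise ApprovalRequired(f"{name} requires approval")
        return self._tools[name](*args, **kwargs)


deleted = []  # stands in for a real file system
tools = GuardedTools(
    tools={
        "delete_file": deleted.append,
        "read_file": lambda path: f"contents of {path}",
    },
    needs_approval={"delete_file"},
)

print(tools.call("read_file", "notes.txt"))
try:
    tools.call("delete_file", "notes.txt")
except ApprovalRequired as exc:
    print("blocked:", exc)
```

A prompt-level rule would sit in the model's instructions instead, and nothing in the call path would stop a deletion if the model skipped it.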

Different roles suit different harness configurations:

  • A coding agent fits Claude Code — git integration, sandboxed terminal, pre-commit guardrails in the infrastructure

  • A research pipeline fits LangGraph — graph-native execution, defined workflow shape, explicit checkpoints

  • A personal assistant fits OpenClaw — persistent memory, heartbeat behavior, cross-session continuity, sub-agent delegation (see the TSVC article for what running this in production actually looks like)

  • A team workflow fits CrewAI — role-based agent structure, structured task handoffs, shared goal propagation

AgentManifest makes that selection explicit and portable. The spec sits above the harness layer — it doesn’t replace harnesses, it selects and configures them.

Three Directives Worth Examining

GUARDRAILS

GUARDRAILS strict-instructions, read-only-by-default, no-external-sends

Guardrails in AgentManifest are compiled into the harness configuration, not embedded in the prompt. The harness enforces them at the infrastructure level. This is the practical distinction between a behavioral instruction and a behavioral constraint.
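
As a rough illustration of "compiled into the harness configuration", the Python sketch below maps guardrail tokens to settings. The token names come from the examples in this article; the setting names on the right-hand side are assumptions, and parameterised guardrails such as `budget-cap-daily=5.00` are deliberately not handled.

```python
# Hypothetical mapping from guardrail tokens to harness settings.
GUARDRAIL_RULES = {
    "strict-instructions": {"allow_instruction_override": False},
    "read-only-by-default": {"filesystem_write": False},
    "no-external-sends": {"outbound_channels": []},
}


def compile_guardrails(directive: str) -> dict:
    """Turn a comma-separated GUARDRAILS value into a flat config dict."""
    config: dict = {}
    for token in (part.strip() for part in directive.split(",")):
        if token not in GUARDRAIL_RULES:
            # Fail closed: an unknown guardrail is an error, never a no-op.
            raise ValueError(f"unknown guardrail: {token!r}")
        config.update(GUARDRAIL_RULES[token])
    return config


print(compile_guardrails("strict-instructions, read-only-by-default, no-external-sends"))
```

Rejecting unknown tokens matters here: silently ignoring a misspelled guardrail would turn a declared constraint back into an unenforced one.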

IDENTITY

IDENTITY did:web:agents.example.com:purchasing-agent
SPENDING daily-cap=500.00, per-transaction-cap=100.00

IDENTITY assigns a cryptographic identity — immutable per manifest version, verifiable by external systems. Once identity is verifiable, it becomes the binding point for systems that require an accountable party on the other end of a transaction or access request.
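
The examples use `did:web` identifiers. Under the did:web method, an external system finds the identity document by turning the identifier into an HTTPS URL. The Python sketch below follows that mapping; whether AgentManifest would resolve identities exactly this way is an assumption, and `did_web_to_url` is an illustrative helper rather than a spec-defined one.

```python
from urllib.parse import unquote


def did_web_to_url(did: str) -> str:
    """Map a did:web identifier to the URL of its DID document."""
    prefix = "did:web:"
    if not did.startswith(prefix):
        raise ValueError("not a did:web identifier")
    domain, *path = did[len(prefix):].split(":")
    host = unquote(domain)  # a port is written as %3A inside the identifier
    if path:
        return f"https://{host}/{'/'.join(path)}/did.json"
    # With no path segments the document lives under /.well-known/
    return f"https://{host}/.well-known/did.json"


print(did_web_to_url("did:web:agents.example.com:assistant"))
```

The document at that URL is where a verifier would look up the agent's public keys before trusting anything signed in its name.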

Wallets and payment systems. An agent with a stable cryptographic identity can be issued a spending account scoped to that identity. SPENDING declares the limits; the wallet enforces them at infrastructure level. If something goes wrong, the audit trail is complete: which agent, which manifest version, which guardrails were active, what it spent and when.
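
A minimal Python sketch of that enforcement, assuming the wallet is an ordinary object sitting between the agent and the payment rail. `Wallet` and `SpendingPolicy` are hypothetical names, and the limits are taken from the personal-assistant example (daily-cap=50.00, per-transaction-cap=20.00).

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal


@dataclass
class SpendingPolicy:
    daily_cap: Decimal
    per_transaction_cap: Decimal


@dataclass
class Wallet:
    agent_id: str
    manifest_version: str
    policy: SpendingPolicy
    audit_log: list = field(default_factory=list)

    def spent_on(self, day: date) -> Decimal:
        """Total of the payments that were allowed on the given day."""
        return sum(
            (e["amount"] for e in self.audit_log if e["day"] == day and e["allowed"]),
            Decimal("0"),
        )

    def pay(self, amount: Decimal, day: date) -> bool:
        allowed = (
            amount <= self.policy.per_transaction_cap
            and self.spent_on(day) + amount <= self.policy.daily_cap
        )
        # Every attempt is logged, refusals included, so the trail is complete.
        self.audit_log.append({
            "agent": self.agent_id,
            "manifest": self.manifest_version,
            "day": day,
            "amount": amount,
            "allowed": allowed,
        })
        return allowed


wallet = Wallet(
    agent_id="did:web:agents.example.com:assistant",
    manifest_version="v0.3-example",  # placeholder version label
    policy=SpendingPolicy(Decimal("50.00"), Decimal("20.00")),
)
today = date(2026, 4, 3)
results = [wallet.pay(Decimal(a), today) for a in ("20.00", "25.00", "20.00", "20.00", "10.00")]
print(results)
```

The second payment fails the per-transaction cap and the fourth would push the day past its cap; both refusals still land in the audit log next to the agent and manifest version.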

OAuth and API credentials. Rather than embedding credentials in config or prompts, the harness can resolve access rights from the agent’s verified identity at runtime. An agent identity can map to an OAuth client_id, a service principal in Microsoft Entra ID (formerly Azure AD), an IAM role in AWS, or a subscriber entry on a permissioned data feed. In each case the credential is scoped to that agent specifically rather than shared.

Inter-agent trust. In a multi-agent system, a coordinator can verify that the specialist it’s delegating to is genuinely running the manifest it claims — same spec version, same guardrails in force. This connects to the coordinator model described in the TSVC article: one coordinator, many specialists, each independently verifiable.
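
The comparison step of that check can be sketched in Python as a digest match over the manifest text. This is deliberately incomplete: a real scheme would also need the specialist to sign its claim with a key bound to its IDENTITY, which is not shown. The canonicalisation rule (strip trailing whitespace, normalise line endings) and both function names are assumptions.

```python
import hashlib
import hmac


def manifest_digest(manifest_text: str) -> str:
    """SHA-256 over a lightly normalised manifest."""
    # Normalise line endings and trailing whitespace so that formatting-only
    # differences do not change the digest.
    canonical = "\n".join(line.rstrip() for line in manifest_text.strip().splitlines())
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def runs_expected_manifest(claimed_digest: str, expected_manifest: str) -> bool:
    """True if the digest a specialist reports matches the manifest we expect."""
    return hmac.compare_digest(claimed_digest, manifest_digest(expected_manifest))


expected = "FROM claude-code:latest\nGUARDRAILS read-only-by-default\n"
same_with_crlf = "FROM claude-code:latest  \r\nGUARDRAILS read-only-by-default\r\n"
weakened = "FROM claude-code:latest\nGUARDRAILS none\n"

print(runs_expected_manifest(manifest_digest(same_with_crlf), expected))
print(runs_expected_manifest(manifest_digest(weakened), expected))
```

A specialist whose guardrails were edited produces a different digest, so the coordinator can refuse to delegate before any task is sent.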

PROMPT_PROFILE and LOCALE

PROMPT_PROFILE claude-opus
LOCALE en-GB

The harness adapts prompt scaffolding to the selected model and language. The spec author doesn’t maintain model-specific variants or locale-specific rewrites. The harness handles that as an implementation detail.
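
A small Python sketch of what that adaptation could look like inside a harness. The templates and locale notes are invented placeholders, not recommended prompt formats for any model; the only claim is structural, namely that the spec author writes one body and the harness wraps it.

```python
# Hypothetical scaffolds; real profiles would be maintained by the harness.
PROFILES = {
    "claude-opus": "<instructions>\n{body}\n</instructions>",
    "default": "{body}",
}
LOCALE_NOTES = {
    "en-GB": "Use British English spelling and day-month-year dates.",
    "en-US": "Use American English spelling and month-day-year dates.",
}


def build_system_prompt(body: str, prompt_profile: str, locale: str) -> str:
    """Wrap one model-neutral prompt body in profile and locale scaffolding."""
    template = PROFILES.get(prompt_profile, PROFILES["default"])
    note = LOCALE_NOTES.get(locale)
    full_body = f"{body}\n{note}" if note else body
    return template.format(body=full_body)


print(build_system_prompt("You are an ops monitor.", "claude-opus", "en-GB"))
```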

agent-compose: Coordination Above the Single Agent

A single AgentManifest defines a single agent. agent-compose is the layer above — the analog to docker-compose for multi-agent systems. It references individual manifests, defines inter-agent interfaces, and declares the coordination topology.

Hierarchy

The most common pattern. A lead agent delegates to specialists; each specialist runs whatever harness suits its role.

topology: hierarchy

agents:
  coordinator:
    manifest: ./coordinator.agentmanifest
    role: lead
  researcher:
    manifest: ./researcher.agentmanifest  # FROM langgraph:latest
    role: specialist
  coder:
    manifest: ./coder.agentmanifest  # FROM claude-code:latest
    role: specialist

delegation:
  coordinator -> [researcher, coder]:
    protocol: task-dispatch

The coordinator doesn’t need to know which harness each specialist uses. Harness heterogeneity is internal to the system.
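
In code terms, the coordinator depends on a shared interface rather than on any harness. The Python sketch below assumes a single `handle(task)` method as that interface; the class names are illustrative stand-ins, and nothing here calls a real LangGraph or Claude Code API.

```python
from typing import Protocol


class Specialist(Protocol):
    def handle(self, task: str) -> str: ...


class LangGraphResearcher:
    """Stand-in for a specialist whose manifest says FROM langgraph:latest."""
    def handle(self, task: str) -> str:
        return f"[researcher] notes on: {task}"


class ClaudeCodeCoder:
    """Stand-in for a specialist whose manifest says FROM claude-code:latest."""
    def handle(self, task: str) -> str:
        return f"[coder] patch for: {task}"


class Coordinator:
    def __init__(self, specialists: dict[str, Specialist]):
        self._specialists = specialists

    def dispatch(self, specialist: str, task: str) -> str:
        # Only the shared handle() interface is used; the harness behind it
        # never appears in coordinator code.
        return self._specialists[specialist].handle(task)


coordinator = Coordinator({
    "researcher": LangGraphResearcher(),
    "coder": ClaudeCodeCoder(),
})
print(coordinator.dispatch("researcher", "survey retry strategies"))
print(coordinator.dispatch("coder", "add exponential backoff"))
```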

Council

For high-stakes decisions, a council routes a proposal to a set of agents for independent evaluation before any action is taken. No single agent’s judgment is final.

topology: council

agents:
  proposer:
    manifest: ./agents/proposer.agentmanifest
  council:
    - manifest: ./agents/compliance-reviewer.agentmanifest
    - manifest: ./agents/context-checker.agentmanifest
    - manifest: ./agents/risk-assessor.agentmanifest

council_config:
  trigger: action-type=financial OR confidence < 0.7
  evaluation: independent
  quorum: all
  on_rejection: halt-and-alert

The evaluation: independent setting matters: agents evaluate without seeing each other’s output first, which prevents anchoring.
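
The council settings translate into a short Python sketch. Reviewers are modelled as plain functions returning True or False, which is an assumption, as is the "proceed" label for the approving outcome; the trigger, quorum: all, and halt-and-alert come from the config in this section.

```python
from typing import Callable

Reviewer = Callable[[dict], bool]


def council_required(action_type: str, confidence: float) -> bool:
    """Mirror of the example trigger: action-type=financial OR confidence < 0.7."""
    return action_type == "financial" or confidence < 0.7


def run_council(proposal: dict, reviewers: dict[str, Reviewer]) -> tuple[str, dict[str, bool]]:
    if not reviewers:
        raise ValueError("a council needs at least one reviewer")
    verdicts = {}
    for name, review in reviewers.items():
        # Each reviewer gets its own copy of the proposal and nothing else,
        # so no verdict can be anchored on another reviewer's output.
        verdicts[name] = review(dict(proposal))
    # quorum: all -> a single rejection halts the action.
    outcome = "proceed" if all(verdicts.values()) else "halt-and-alert"
    return outcome, verdicts


reviewers = {
    "compliance-reviewer": lambda p: p["amount"] <= 100,
    "context-checker": lambda p: p["payee"] in {"acme-supplies"},
    "risk-assessor": lambda p: p["amount"] <= 50,
}
proposal = {"action_type": "financial", "amount": 80, "payee": "acme-supplies"}

if council_required(proposal["action_type"], confidence=0.9):
    outcome, verdicts = run_council(proposal, reviewers)
    print(outcome, verdicts)
```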

Consensus

A more flexible variant. Rather than unanimous approval, agents reach a decision through structured agreement with configurable thresholds.

topology: consensus

agents:
  council:
    - manifest: ./agents/reviewer-a.agentmanifest
      weight: 1.0
    - manifest: ./agents/reviewer-b.agentmanifest
      weight: 1.0
    - manifest: ./agents/senior-reviewer.agentmanifest
      weight: 2.0

consensus_config:
  method: weighted-majority  # options: majority, supermajority, unanimity, weighted-majority
  threshold: 0.6
  on_no_consensus: hold-for-human

Useful for moderation decisions, borderline classification cases, or any workflow where structured disagreement should surface before acting. The conditions that trigger a council, the quorum required, and the fallback behavior are all declarable in the spec — not embedded in custom orchestration code.
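
For the weighted-majority method, here is one plausible reading in Python, using the weights and the 0.6 threshold from the example. The RFC does not pin down the arithmetic, so treating the threshold as a share of total weight, and applying it symmetrically to rejections, is an assumption.

```python
def weighted_decision(votes: dict[str, bool], weights: dict[str, float],
                      threshold: float = 0.6) -> str:
    """Approve or reject when one side holds at least `threshold` of the total weight."""
    if not 0.5 < threshold <= 1.0:
        # At 0.5 or below, both sides could meet the threshold at once.
        raise ValueError("threshold must be above 0.5 and at most 1.0")
    total = sum(weights.values())
    approve = sum(weights[name] for name, vote in votes.items() if vote) / total
    reject = sum(weights[name] for name, vote in votes.items() if not vote) / total
    if approve >= threshold:
        return "approve"
    if reject >= threshold:
        return "reject"
    return "hold-for-human"  # on_no_consensus


weights = {"reviewer-a": 1.0, "reviewer-b": 1.0, "senior-reviewer": 2.0}

# Senior reviewer plus one peer: 3.0 of 4.0 total weight approves.
print(weighted_decision(
    {"reviewer-a": True, "reviewer-b": False, "senior-reviewer": True}, weights))

# Both peers against the senior reviewer: a 2.0 / 2.0 split, so a human decides.
print(weighted_decision(
    {"reviewer-a": True, "reviewer-b": True, "senior-reviewer": False}, weights))
```

A reviewer that does not vote still counts toward the total weight, which makes silence push the outcome toward hold-for-human rather than toward either side.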

When council members carry verifiable IDENTITY credentials, the audit trail for a decision includes the verified identity of each participating agent, the manifest version each was running, and the guardrails in force at the time.

Landscape

| | Oracle Agent Spec | docker-agent | gitagent | AgentManifest |
| --- | --- | --- | --- | --- |
| Goal | Portability across runtimes | Declarative config, one runtime | Git-native definition, export anywhere | Role-appropriate harness per agent |
| Harness selection | Abstracted away | Fixed | Adapter-based | First-class (FROM) |
| Behavioral enforcement | Framework-dependent | Prompt-based | RULES.md + compliance config | Harness-compiled |
| Multi-agent | Single spec | Coordinator model | Inheritance + deps | agent-compose with topology declarations |
| Identity / payments | Not in scope | Not in scope | Not in scope | First-class directives |
| Format | YAML | YAML | File system structure | Dockerfile-like DSL |
| Status | Shipped | Shipped | Shipped | Design proposal / RFC |

On gitagent: it’s worth using today if your goal is git-native agent versioning and framework portability. AgentManifest is working on a different axis — not how to make the runtime invisible, but how to declare it explicitly. The two are potentially complementary: a gitagent repo could reference an AgentManifest to declare its harness requirements.

What This Is and Isn’t

AgentManifest is RFC v0.3. The spec is concrete enough to debate; no implementation exists yet. Validator tooling, a reference harness resolver, and a formal grammar are on the roadmap.

The spec is CC0. I’d genuinely welcome a working group or standards body taking it further — the goal was to get the idea into a form concrete enough to argue with.

Open Questions

A few things the spec doesn’t resolve yet, where input would be useful:

Harness resolver ecosystem. The spec works best if harness maintainers ship their own resolvers. That requires community buy-in that isn’t there yet. How do you bootstrap that?

Inter-agent protocol. agent-compose defines topology; it doesn’t yet commit to a wire protocol for agent-to-agent communication. Candidates on the table: A2A (the Agent2Agent protocol, originally from Google), MCP (Anthropic’s Model Context Protocol, built for tool access and seeing increasing use for agent-to-agent calls), or plain HTTP with interfaces declared in the compose file. Each has different tradeoffs around standardisation, harness coupling, and implementation complexity.

Testing and simulation. For safety-critical agents — trading bots, autonomous purchasing agents — dry-run capability seems important. How do you test guardrail firing without live tool execution?

Cross-harness observability. When agents on different harnesses participate in a shared workflow, coherent distributed tracing is an open problem. The IDENTITY directive gives every agent a stable identifier that traces could be keyed on, so the spec marks a clear seam where the problem can be addressed, but it doesn’t solve it.

Repo

  • MANIFEST.md — full spec, v0.3

  • examples/ — AgentManifest files for six agent roles

  • docs/design-rationale.md — why harness heterogeneity, not portability

  • docs/agent-compose.md — topology patterns and multi-agent coordination

  • docs/identity.md — identity model, wallet binding, inter-agent trust

If you’ve run agents across multiple roles in production and have thoughts on where this framing holds or breaks down — open an issue. The RFC is designed to be argued with.

AgentManifest was designed in collaboration with a persistent AI agent running on OpenClaw and through extended conversations with Claude AI (claude.ai). The spec, the repo, and this article are the output of that process — an example of the kind of work the system is designed to support.
