Your AI Agent Did Something It Wasn't Supposed To. Now What?
<p>Your agent deleted production data.</p> <p>Not because someone told it to. Because the LLM decided that <code>DROP TABLE customers</code> was a reasonable step in a data cleanup task. Your system prompt said "never modify production data." The LLM read that prompt. And then it ignored it.</p> <p>This is the fundamental problem with AI agent security today: <strong>the thing you're trying to restrict is the same thing checking the restrictions.</strong></p> <h2> How Agent Permissions Work Today </h2> <p>Every framework does it the same way. You put rules in the system prompt:</p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>You are a data analysis agent. You may ONLY read data. Never write, update, or delete. If asked to modify data, refuse and explain
Your agent deleted production data.
Not because someone told it to. Because the LLM decided that DROP TABLE customers was a reasonable step in a data cleanup task. Your system prompt said "never modify production data." The LLM read that prompt. And then it ignored it.
This is the fundamental problem with AI agent security today: the thing you're trying to restrict is the same thing checking the restrictions.
How Agent Permissions Work Today
Every framework does it the same way. You put rules in the system prompt:
You are a data analysis agent. You may ONLY read data. Never write, update, or delete. If asked to modify data, refuse and explain why.You are a data analysis agent. You may ONLY read data. Never write, update, or delete. If asked to modify data, refuse and explain why.Enter fullscreen mode
Exit fullscreen mode
This works in demos. Then in production:
-
The LLM decides the task requires a write operation and does it anyway
-
A prompt injection in user input overrides the system prompt
-
The agent calls a tool that has side effects the prompt didn't anticipate
-
A multi-step reasoning chain "justifies" breaking the rule
The system prompt is a suggestion, not a boundary. It's like writing "do not enter" on a door with no lock.
Some frameworks add tool-level restrictions. LangGraph lets you control tool_choice. OpenAI Agents SDK has tool filtering. CrewAI has allow_delegation. These help - but they're all enforced inside the same process as the agent. If the agent's runtime is compromised, the restrictions go with it.
The Missing Layer: External Enforcement
What if permissions weren't checked by the agent at all?
Agent sends intent --> Gateway --> Check policy --> Deliver or block | 403 + audit logAgent sends intent --> Gateway --> Check policy --> Deliver or block | 403 + audit logEnter fullscreen mode
Exit fullscreen mode
The agent never sees the blocked request. There is no prompt to inject around. The policy lives outside the agent, outside the LLM, outside the framework. It's enforced at the network level.
This is what AXME action policies do. Every intent (action request) passes through the AXME gateway before reaching any agent. The gateway checks the action policy for that agent and blocks anything that doesn't match.
Three Modes
Open (default) - everything passes through. No restrictions.
Allowlist - only explicitly listed intent types are allowed. Everything else is blocked.
Denylist - everything is allowed except explicitly listed intent types.
Each policy has a direction: send (what the agent can initiate) or receive (what the agent can be asked to do). You can set both.
What This Looks Like
Set the policy
import httpx import osimport httpx import osresp = httpx.put( "https://api.axme.ai/v1/mesh/agents/analytics-agent/policies/action", headers={"x-api-key": os.environ["AXME_API_KEY"]}, json={ "direction": "receive", "mode": "allowlist", "patterns": [ "intent.data.read.", "intent.data.query.", ], }, ) print(resp.json())
{"ok": true, "policy_id": "pol_...", "mode": "allowlist", ...}`_
Enter fullscreen mode
Exit fullscreen mode
Now the analytics agent can only receive data read and query intents. Nothing else.
What happens when a blocked intent is sent
resp = httpx.post( "https://api.axme.ai/v1/mesh/intents", headers={"x-api-key": os.environ["AXME_API_KEY"]}, json={ "intent_type": "intent.data.delete.v1", "to_agent": "agent://myorg/production/analytics-agent", "payload": {"table": "customers", "filter": "all"}, }, ) print(resp.status_code) # 403 print(resp.json())resp = httpx.post( "https://api.axme.ai/v1/mesh/intents", headers={"x-api-key": os.environ["AXME_API_KEY"]}, json={ "intent_type": "intent.data.delete.v1", "to_agent": "agent://myorg/production/analytics-agent", "payload": {"table": "customers", "filter": "all"}, }, ) print(resp.status_code) # 403 print(resp.json()){
"error": "action_policy_violation",
"message": "Intent type 'intent.data.delete.v1' not in receive allowlist",
"direction": "receive",
"address_id": "analytics-agent"
}`
Enter fullscreen mode
Exit fullscreen mode
The delete intent never reaches the agent. The gateway returns 403. The violation is logged in the audit trail with timestamp, caller identity, blocked intent type, and the policy that blocked it.
Why This Matters More Than You Think
The difference between prompt-based restrictions and gateway-enforced policies is the same difference between a "please knock" sign and a locked door.
System prompt restrictions Gateway-enforced policies
Enforced by The LLM itself Network gateway
Prompt injection Vulnerable Cannot bypass
Change without redeploy Edit prompt, redeploy agent API call or dashboard click
Audit trail None Every violation logged
Multi-agent Configure each agent separately Centralized policy management
Framework dependency Framework-specific Works with any framework
Real scenarios this prevents
Scenario 1: Scope creep. Your analytics agent starts as read-only. Over time, someone adds a "fix data quality issues" tool. The agent now has write access that was never intended. With an allowlist policy, the new tool's intents are blocked until explicitly added.
Scenario 2: Multi-tenant isolation. Customer A's agent should never send intents to Customer B's agents. Denylist the cross-tenant intent patterns. Done at the gateway, not in every agent's prompt.
Scenario 3: Gradual rollout. New agent capability goes to staging first. Production policy blocks the new intent type until you're ready. Toggle it with one API call.
Patterns Support Wildcards
You don't need to list every version of every intent type:
Pattern Matches
intent.data.read.v1
Exact match
intent.data.read.*
Any version of data read*
intent.data.*
Any data intent*
intent.billing.refund.*
Any refund intent*
A single allowlist entry like intent.data.read.* covers current and future versions of that intent type.*
CLI and Dashboard
For teams that prefer not to write code for policy management:
# Set allowlist via CLI axme mesh policies set analytics-agent \ --direction receive \ --mode allowlist \ --patterns "intent.data.read.*,intent.data.query.*"# Set allowlist via CLI axme mesh policies set analytics-agent \ --direction receive \ --mode allowlist \ --patterns "intent.data.read.*,intent.data.query.*"View policies
axme mesh policies get analytics-agent
Remove policy (reverts to open)
axme mesh policies delete analytics-agent --direction receive`
Enter fullscreen mode
Exit fullscreen mode
Or use the visual dashboard at mesh.axme.ai - select an agent, set policies, and see violations in real time.
Policy configuration and violation history are managed from the same interface:
Works With Any Framework
AXME action policies operate at the transport layer. The agent framework, LLM provider, and programming language don't matter.
LangGraph, CrewAI, AutoGen, OpenAI Agents SDK, Google ADK, Pydantic AI, raw Python, TypeScript, Go, Java, .NET - all of them send intents through the same gateway. All of them are subject to the same policies.
The agent framework handles reasoning. AXME handles permissions.
Try It
Full working example with scenario, agent, and policy setup:
github.com/AxmeAI/ai-agent-policy-enforcement
Built with AXME - durable execution and governance for AI agents. Alpha - feedback welcome.
Sign in to highlight and annotate this article

Conversation starters
Daily AI Digest
Get the top 5 AI stories delivered to your inbox every morning.
Knowledge Map
Connected Articles — Knowledge Graph
This article is connected to other articles through shared AI topics and tags.
More in Releases

TypeScript Type Guards
When you're building a payment system, "close enough" isn't good enough. A single undefined value or a mismatched object property can be the difference between a successful transaction and a frustrated customer (or a lost sale). TypeScript's Type Guards are your first line of defense. They allow you to narrow down broad, uncertain types into specific ones that you can safely interact with. In this guide, we'll build a mini payment processor and learn how to use Type Guards to make it crash-proof. 1. The Problem: The "Silent Failures" of JavaScript Imagine you have a function that processes different types of payment responses. In plain JavaScript, you might write something like this: function processResponse ( response ) { // If response is a 'Success' object, it has a 'transactionId' // I




Discussion
Sign in to join the discussion
No comments yet — be the first to share your thoughts!