Models model benchmark announce service valuation analysis

CivicShield: A Cross-Domain Defense-in-Depth Framework for Securing Government-Facing AI Chatbots Against Multi-Turn Adversarial Attacks

arXiv cs.CRby [Submitted on 30 Mar 2026]April 1, 20262 min read1 views

arXiv:2603.29062v1 Announce Type: new Abstract: LLM-based chatbots in government services face critical security gaps. Multi-turn adversarial attacks achieve over 90% success against current defenses, and single-layer guardrails are bypassed with similar rates. We present CivicShield, a cross-domain defense-in-depth framework for government-facing AI chatbots. Drawing on network security, formal verification, biological immune systems, aviation safety, and zero-trust cryptography, CivicShield introduces seven defense layers: (1) zero-trust foundation with capability-based access control, (2) perimeter input validation, (3) semantic firewall with intent classification, (4) conversation state machine with safety invariants, (5) behavioral anomaly detection, (6) multi-model consensus verifica

View PDF HTML (experimental)

Abstract:LLM-based chatbots in government services face critical security gaps. Multi-turn adversarial attacks achieve over 90% success against current defenses, and single-layer guardrails are bypassed with similar rates. We present CivicShield, a cross-domain defense-in-depth framework for government-facing AI chatbots. Drawing on network security, formal verification, biological immune systems, aviation safety, and zero-trust cryptography, CivicShield introduces seven defense layers: (1) zero-trust foundation with capability-based access control, (2) perimeter input validation, (3) semantic firewall with intent classification, (4) conversation state machine with safety invariants, (5) behavioral anomaly detection, (6) multi-model consensus verification, and (7) graduated human-in-the-loop escalation. We present a formal threat model covering 8 multi-turn attack families, map the framework to NIST SP 800-53 controls across 14 families, and evaluate using ablation analysis. Theoretical analysis shows layered defenses reduce attack probability by 1-2 orders of magnitude versus single-layer approaches. Simulation against 1,436 scenarios including HarmBench (416), JailbreakBench (200), and XSTest (450) achieves 72.9% combined detection [69.5-76.0% CI] with 2.9% effective false positive rate after graduated response, while maintaining 100% detection of multi-turn crescendo and slow-drift attacks. The honest drop on real benchmarks versus author-generated scenarios (71.2% vs 76.7% on HarmBench, 47.0% vs 70.0% on JailbreakBench) validates independent evaluation importance. CivicShield addresses an open gap at the intersection of AI safety, government compliance, and practical deployment.

Comments: 25 pages, 17 tables, 2 figures

Subjects:

Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)

Cite as: arXiv:2603.29062 [cs.CR]

(or arXiv:2603.29062v1 [cs.CR] for this version)

https://doi.org/10.48550/arXiv.2603.29062

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: KrishnaSaiReddy Patil [view email] [v1] Mon, 30 Mar 2026 22:58:04 UTC (49 KB)

Original source

arXiv cs.CR

https://arxiv.org/abs/2603.29062

Was this article helpful?

Ask AI about this article

Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

modelbenchmarkannounce

ProductsLive

AI News This Week: April 05, 2026 - A New Era of Rapid Development and Multimodal Intelligence

AI News This Week: April 05, 2026 - A New Era of Rapid Development and Multimodal Intelligence Published: April 05, 2026 | Reading time: ~10 min This week has been nothing short of phenomenal for the AI community, with breakthroughs and announcements that promise to revolutionize the way we develop and interact with artificial intelligence. From building personal AI agents in a matter of hours to the unveiling of cutting-edge multimodal intelligence models, the pace of innovation is not just accelerating - it's transforming the landscape of what's possible. Whether you're a seasoned developer or just starting to explore the world of AI, this week's news is a must-know, offering insights into how technology is making AI more accessible, powerful, and integrated into our daily lives. Buildin

Dev.to AI

6m34 minutes ago

Open Source AILive

Untitled

You have 50 models. Each trained on different data, different domain, different patient population. You want them to get smarter from each other. So you do the obvious thing — you set up a central aggregator. Round 1: gradients in, averaged weights out. Works fine at N=5. At N=20 you notice the coordinator is sweating. At N=50, round latency has tripled, your smallest sites are timing out, and your bandwidth budget is gone. You tune the hell out of it. Same ceiling. This is not a configuration problem. This is an architecture ceiling. The math underneath it guarantees you hit a wall. A different architecture changes the math. The combinatorics you are not harvesting Start with a fact that has nothing to do with any particular framework: N agents have exactly N(N-1)/2 unique pairwise relati

Dev.to AI

10m38 minutes ago

ProductsLive

This Week in AI: April 05, 2026 - Revolutionizing Development with Personal Agents and Multimodal Intelligence

This Week in AI: April 05, 2026 - Revolutionizing Development with Personal Agents and Multimodal Intelligence Published: April 05, 2026 | Reading time: ~10 min This week has been incredibly exciting for AI enthusiasts and developers alike. With advancements in personal AI agents, multimodal intelligence, and compact models for enterprise documents, the field is rapidly evolving. One of the most significant trends is the ability to build and deploy useful AI prototypes in a remarkably short amount of time. This shift is largely due to innovative tools and ecosystems that are making AI more accessible to individual builders. In this article, we'll dive into the latest AI news, exploring what these developments mean for developers and the broader implications for the industry. Building a Per

Dev.to AI

5m34 minutes ago

Knowledge Map

TopicsEntitiesSource

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 239 connections

Scroll to zoom · drag to pan · click to open

Discussion

No comments yet — be the first to share your thoughts!

More in Models

ModelsFresh

research-llm-apis 2026-04-04

Release: research-llm-apis 2026-04-04 I'm working on a major change to my LLM Python library and CLI tool. LLM provides an abstraction layer over hundreds of different LLMs from dozens of different vendors thanks to its plugin system, and some of those vendors have grown new features over the past year which LLM's abstraction layer can't handle, such as server-side tool execution. To help design that new abstraction layer I had Claude Code read through the Python client libraries for Anthropic, OpenAI, Gemini and Mistral and use those to help craft curl commands to access the raw JSON for both streaming and non-streaming modes across a range of different scenarios. Both the scripts and the captured outputs now live in this new repo. Tags: llm , apis , json , llms

Simon Willison Blog

1mabout 6 hours ago

ModelsFresh

scan-for-secrets 0.1

Release: scan-for-secrets 0.1 I like publishing transcripts of local Claude Code sessions using my claude-code-transcripts tool but I'm often paranoid that one of my API keys or similar secrets might inadvertently be revealed in the detailed log files. I built this new Python scanning tool to help reassure me. You can feed it secrets and have it scan for them in a specified directory: uvx scan-for-secrets $OPENAI_API_KEY -d logs-to-publish/ If you leave off the -d it defaults to the current directory. It doesn't just scan for the literal secrets - it also scans for common encodings of those secrets e.g. backslash or JSON escaping, as described in the README . If you have a set of secrets you always want to protect you can list commands to echo them in a ~/.scan-for-secrets.conf.sh file. Mi

Simon Willison Blog

2mabout 3 hours ago

ModelsLive

Harvard Proved Emotions Don't Make AI Smarter — That's Exactly Why You Need Soul Spec

The Myth Dies Hard "I'll tip you $200 if you get this right." "This is really important to my career." "I'm so frustrated — please help me." If you've spent any time on AI Twitter, you've seen people swear that emotional prompting makes LLMs perform better. A few anecdotal successes became gospel. The technique spread. Now Harvard has the data. It doesn't work. What the Research Actually Shows A team from Harvard and Bryn Mawr ( arXiv:2604.02236 , April 2026) ran a systematic study across 6 benchmarks, 6 emotions, 3 models (Qwen3-14B, Llama 3.3-70B, DeepSeek-V3.2), and multiple intensity levels. Finding 1: Fixed emotional prefixes have negligible effect. Adding "I'm angry about this" or "This makes me so happy" before your prompt? Across GSM8K, BIG-Bench Hard, MedQA, BoolQ, OpenBookQA, and

Dev.to AI

4m32 minutes ago

ModelsLive

Self-Improving Python Scripts with LLMs: My Journey

As a developer, I've always been fascinated by the idea of self-improving code. Recently, I've been experimenting with using Large Language Models (LLMs) to make my Python scripts more autonomous and efficient. In this article, I'll share my experience with integrating LLMs into my Python workflow and how it has revolutionized my development process. I'll also provide a step-by-step guide on how to get started with making your own Python scripts improve themselves using LLMs. My journey with LLMs began when I stumbled upon the llm_groq module, which allows you to interact with LLMs using a simple and intuitive API. I was impressed by the accuracy and speed of the model, and I quickly realized that it could be used to improve my Python scripts. The first step in making my scripts self-impro

Dev.to AI

4m24 minutes ago