Scanned 500 AI agent repos for bugs, nobody thinks of infinite loops

Hacker News AI Top · by Inkog · April 4, 2026 · 3 min read

Article URL: https://inkog.io/report
Comments URL: https://news.ycombinator.com/item?id=47642960
Points: 2 · Comments: 1

Research Report

Findings from scanning 500+ open-source AI agent projects

The largest security analysis of the AI agent ecosystem. Original data from automated static analysis — not surveys or interviews.

85% of repos had at least one vulnerability

25% failed EU AI Act Article 14 (human oversight)

11,705 total findings across all repositories


500+ Repos Scanned

85% With Findings

63% CRITICAL/HIGH

25% Article 14 Fail

What you'll learn

500+ repos. 11,705 findings. 10 frameworks compared. Here's what the data reveals.

Which vulnerability appears in 4 out of 5 agent repos?

The top 10 vulnerability types ranked by prevalence — and why the #1 finding isn't prompt injection.

Which framework has 3x more critical findings than average?

Head-to-head security comparison across LangChain, CrewAI, AutoGen, pydantic-ai, MCP servers, and more.

Why 25% of repos fail EU AI Act Article 14

Compliance readiness scores for every repo. Article-by-article breakdown of where the ecosystem falls short.

MCP servers: the new attack surface nobody is auditing

The first large-scale security audit of MCP server repositories. Tool poisoning, argument injection, and credential exposure.

What goes wrong in repos with 25K+ stars

Anonymized deep-dives into popular frameworks. High star counts don't mean high security — here's the proof.

The 5 fixes that eliminate 80% of findings

Actionable remediation guidance for developers, security teams, and CISOs. Mapped to OWASP Agentic Top 10 and NIST AI RMF.

Methodology

Discovery

40 GitHub search queries targeting AI agent frameworks (LangChain, CrewAI, AutoGen, MCP servers, and 35+ others). Top 100 results per query, sorted by stars. Deduplicated and filtered to repos with 20+ stars, no forks.
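The deduplication and filtering step described above could look roughly like the sketch below. It assumes each search hit is a dict shaped like a GitHub REST API repository object (`full_name`, `stargazers_count`, `fork`). The function name `filter_candidates` and the constant `MIN_STARS` are illustrative and do not come from the report.

```python
from typing import Iterable

MIN_STARS = 20  # threshold stated in the methodology


def filter_candidates(results: Iterable[dict], min_stars: int = MIN_STARS) -> list[dict]:
    """Deduplicate search hits and keep non-fork repos with enough stars."""
    kept: dict[str, dict] = {}
    for repo in results:
        key = repo["full_name"].lower()  # treat owner/name as case-insensitive
        if key in kept:
            continue  # same repo returned by another query
        if repo.get("fork", False):
            continue  # forks are excluded
        if repo.get("stargazers_count", 0) < min_stars:
            continue  # below the star threshold
        kept[key] = repo
    # Most-starred first, mirroring the "sorted by stars" ordering
    return sorted(kept.values(), key=lambda r: r["stargazers_count"], reverse=True)
```

Keying on the lowercased `full_name` is one simple way to collapse the same repository when it is returned by several overlapping queries.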

Scanning

Each repo was shallow-cloned and scanned with Inkog v1.1.0 using the comprehensive policy (all detectors, no confidence filtering). Results were parsed and stored as structured JSON.
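A clone-scan-store loop of this kind might be wired together as in the following sketch. The report does not show the Inkog command line, so the scanner is passed in as a hypothetical callable (`scanner`) rather than guessed at. The output file naming and the JSON layout are also assumptions.

```python
import json
import subprocess
from pathlib import Path
from typing import Callable


def shallow_clone_cmd(repo_url: str, dest: Path) -> list[str]:
    """Build a git command that fetches only the latest commit."""
    return ["git", "clone", "--depth", "1", "--quiet", repo_url, str(dest)]


def scan_repo(repo_url: str, workdir: Path,
              scanner: Callable[[Path], list[dict]],
              run=subprocess.run) -> Path:
    """Clone one repo, run `scanner` on the checkout, write findings as JSON."""
    name = repo_url.rstrip("/").split("/")[-1].removesuffix(".git")
    dest = workdir / name
    run(shallow_clone_cmd(repo_url, dest), check=True)
    findings = scanner(dest)  # hypothetical stand-in for the real scanner call
    out_path = workdir / f"{name}.findings.json"
    out_path.write_text(json.dumps({"repo": repo_url, "findings": findings}, indent=2))
    return out_path
```

Passing `run` as a parameter keeps the sketch testable without network access. In real use the default `subprocess.run` performs the clone.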

Analysis

Inkog's Universal IR engine converts agent code written for any framework into a framework-agnostic intermediate representation (IR). Detection rules, data-flow graph (DFG) taint analysis, and compliance mapping all run on this unified IR.
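To make the idea of taint analysis over a data-flow graph concrete, here is a minimal generic sketch. It is not Inkog's engine: the graph encoding (a dict of successor lists), the node names, and the function `tainted_sinks` are all assumptions chosen for illustration, and sanitizers are ignored.

```python
from collections import deque


def tainted_sinks(edges: dict[str, list[str]],
                  sources: set[str],
                  sinks: set[str]) -> set[str]:
    """Return the sinks reachable from any taint source in a data-flow graph."""
    tainted = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for succ in edges.get(node, []):
            if succ not in tainted:  # visit each node once, so cycles terminate
                tainted.add(succ)
                queue.append(succ)
    return tainted & sinks


# Illustrative graph: user input flows through a prompt into a shell call.
example_edges = {
    "user_input": ["prompt"],
    "prompt": ["llm_output"],
    "llm_output": ["shell_exec"],
    "config": ["log_write"],
}
flagged = tainted_sinks(example_edges, {"user_input"}, {"shell_exec", "log_write"})
```

In this example `flagged` contains only `shell_exec`, because `log_write` is fed by `config`, which is not a taint source.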

Compliance Mapping

Every finding is automatically mapped to EU AI Act articles, NIST AI RMF controls, and OWASP Agentic Top 10 entries. Governance scores are computed for each repository.
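The mapping and scoring step can be pictured with the sketch below. The report does not publish its rule table or scoring formula, so everything here is an assumption made for illustration: the rule IDs, the `CONTROL_MAP` entries (only "EU AI Act Article 14" is named in the source), the severity names beyond CRITICAL and HIGH, and the penalty weights.

```python
from collections import Counter

# Hypothetical rule-to-control table. Only "EU AI Act Article 14" (human
# oversight) is named in the report; rule IDs and other labels are placeholders.
CONTROL_MAP = {
    "missing_human_approval": ["EU AI Act Article 14"],
    "unbounded_loop": ["placeholder: OWASP Agentic Top 10 entry"],
}

# Assumed penalty per severity level; not the report's actual weighting.
SEVERITY_PENALTY = {"CRITICAL": 25, "HIGH": 15, "MEDIUM": 5, "LOW": 1}


def map_findings(findings: list[dict]) -> Counter:
    """Count findings per mapped control; unknown rules land in 'unmapped'."""
    counts: Counter = Counter()
    for finding in findings:
        for control in CONTROL_MAP.get(finding["rule"], ["unmapped"]):
            counts[control] += 1
    return counts


def governance_score(findings: list[dict]) -> int:
    """Toy score: start at 100, subtract a penalty per finding, floor at 0."""
    penalty = sum(SEVERITY_PENALTY.get(f["severity"], 0) for f in findings)
    return max(0, 100 - penalty)
```

A lookup table keeps the mapping auditable, since each control label can be traced back to the rule that triggered it.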

Based on scanning 500+ repositories across every major AI agent framework. The only report backed by automated static analysis data — not surveys or interviews.

LangChain, CrewAI, AutoGen, pydantic-ai, LangGraph, MCP Servers, OpenAI Agents, n8n, Flowise, DSPy


