Live
Black Hat USADark ReadingBlack Hat AsiaAI BusinessHow We Built an EdTech Platform That Scaled to 250K Daily UsersDEV CommunityRoguelike Devlog: Redesigning a Game UI With an AI 2D Game MakerDEV CommunityI spent days debugging a cron job that was "working fine"DEV CommunityLLM Agents Need a Nervous System, Not Just a BrainDEV CommunityThe 22,000 Token Tax: Why I Killed My MCP ServerDEV CommunityOpenSpec (Spec-Driven Development) Failed My Experiment — Instructions.md Was Simpler and FasterDEV CommunityI Asked AI to Do Agile Sprint Planning (GitHub Copilot Test)DEV Community🌪️ Proof of Work: The To-Do List of Infinite RegretDEV CommunityShaping the UAE’s Digital Destiny: Building Sovereignty, Trust, and Resilience in the Cyber EraEE TimesA new dating app, Sonder, has a deliberately annoying sign-up process (and it’s working)TechCrunchPromoting late-gameplay BG3 composition contracts in the TD2 SDL portDEV CommunityArtificial Intelligence Is Facing a Crisis of Control—and the Industry Knows It - Council on Foreign RelationsGoogle News: AI SafetyBlack Hat USADark ReadingBlack Hat AsiaAI BusinessHow We Built an EdTech Platform That Scaled to 250K Daily UsersDEV CommunityRoguelike Devlog: Redesigning a Game UI With an AI 2D Game MakerDEV CommunityI spent days debugging a cron job that was "working fine"DEV CommunityLLM Agents Need a Nervous System, Not Just a BrainDEV CommunityThe 22,000 Token Tax: Why I Killed My MCP ServerDEV CommunityOpenSpec (Spec-Driven Development) Failed My Experiment — Instructions.md Was Simpler and FasterDEV CommunityI Asked AI to Do Agile Sprint Planning (GitHub Copilot Test)DEV Community🌪️ Proof of Work: The To-Do List of Infinite RegretDEV CommunityShaping the UAE’s Digital Destiny: Building Sovereignty, Trust, and Resilience in the Cyber EraEE TimesA new dating app, Sonder, has a deliberately annoying sign-up process (and it’s working)TechCrunchPromoting late-gameplay BG3 composition contracts in the TD2 SDL portDEV CommunityArtificial Intelligence Is Facing a Crisis of Control—and the Industry Knows It - Council on Foreign RelationsGoogle News: AI Safety

Security in LLM-as-a-Judge: A Comprehensive SoK

arXiv cs.CRby Aiman Almasoud, Antony Anju, Marco Arazzi, Mert Cihangiroglu, Vignesh Kumar Kembu, Serena Nicolazzo, Antonino Nocera, Vinod P., Saraga SakthidharanApril 1, 20262 min read0 views
Source Quiz

arXiv:2603.29403v1 Announce Type: new Abstract: LLM-as-a-Judge (LaaJ) is a novel paradigm in which powerful language models are used to assess the quality, safety, or correctness of generated outputs. While this paradigm has significantly improved the scalability and efficiency of evaluation processes, it also introduces novel security risks and reliability concerns that remain largely unexplored. In particular, LLM-based judges can become both targets of adversarial manipulation and instruments through which attacks are conducted, potentially compromising the trustworthiness of evaluation pipelines. In this paper, we present the first Systematization of Knowledge (SoK) focusing on the security aspects of LLM-as-a-Judge systems. We perform a comprehensive literature review across major aca

View PDF HTML (experimental)

Abstract:LLM-as-a-Judge (LaaJ) is a novel paradigm in which powerful language models are used to assess the quality, safety, or correctness of generated outputs. While this paradigm has significantly improved the scalability and efficiency of evaluation processes, it also introduces novel security risks and reliability concerns that remain largely unexplored. In particular, LLM-based judges can become both targets of adversarial manipulation and instruments through which attacks are conducted, potentially compromising the trustworthiness of evaluation pipelines. In this paper, we present the first Systematization of Knowledge (SoK) focusing on the security aspects of LLM-as-a-Judge systems. We perform a comprehensive literature review across major academic databases, analyzing 863 works and selecting 45 relevant studies published between 2020 and 2026. Based on this study, we propose a taxonomy that organizes recent research according to the role played by LLM-as-a-Judge in the security landscape, distinguishing between attacks targeting LaaJ systems, attacks performed through LaaJ, defenses leveraging LaaJ for security purposes, and applications where LaaJ is used as an evaluation strategy in security-related domains. We further provide a comparative analysis of existing approaches, highlighting current limitations, emerging threats, and open research challenges. Our findings reveal significant vulnerabilities in LLM-based evaluation frameworks, as well as promising directions for improving their robustness and reliability. Finally, we outline key research opportunities that can guide the development of more secure and trustworthy LLM-as-a-Judge systems.

Subjects:

Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)

Cite as: arXiv:2603.29403 [cs.CR]

(or arXiv:2603.29403v1 [cs.CR] for this version)

https://doi.org/10.48550/arXiv.2603.29403

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Serena Nicolazzo Dr [view email] [v1] Tue, 31 Mar 2026 08:05:54 UTC (104 KB)

Was this article helpful?

Sign in to highlight and annotate this article

AI
Ask AI about this article
Powered by AI News Hub · full article context loaded
Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

modellanguage modelannounce

Knowledge Map

Knowledge Map
TopicsEntitiesSource
Security in…modellanguage mo…announceapplicationvaluationanalysisarXiv cs.CR

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 199 connections
Scroll to zoom · drag to pan · click to open

Discussion

Sign in to join the discussion

No comments yet — be the first to share your thoughts!

More in Research Papers