
Cross-Model Disagreement as a Label-Free Correctness Signal

arXiv, March 26, 2026

Matt Gorbett, Suman Jana


Abstract: Detecting when a language model is wrong without ground truth labels is a fundamental challenge for safe deployment. Existing approaches rely on a model's own uncertainty -- such as token entropy or confidence scores -- but these signals fail critically on the most dangerous failure mode: confident errors, where a model is wrong but certain. In this work we introduce cross-model disagreement as a correctness indicator -- a simple, training-free signal that can be dropped into existing production systems, routing pipelines, and deployment monitoring infrastructure without modification. Given a model's generated answer, cross-model disagreement computes how surprised or uncertain a second verifier model is when reading that answer via a single forward pass. No generation from the verifying model is required, and no correctness labels are needed. We instantiate this principle as Cross-Model Perplexity (CMP), which measures the verifying model's surprise at the generating model's answer tokens, and Cross-Model Entropy (CME), which measures the verifying model's uncertainty at those positions. Both CMP and CME outperform within-model uncertainty baselines across benchmarks spanning reasoning, retrieval, and mathematical problem solving (MMLU, TriviaQA, and GSM8K). On MMLU, CMP achieves a mean AUROC of 0.75 against a within-model entropy baseline of 0.59. These results establish cross-model disagreement as a practical, training-free approach to label-free correctness estimation, with direct applications in deployment monitoring, model routing, selective prediction, data filtering, and scalable oversight of production language model systems.
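As a rough illustration of the two signals named in the abstract, the sketch below computes a perplexity-style and an entropy-style score from a verifier's next-token logits at the answer positions. The exact formulas are an assumption based on the standard definitions of perplexity and Shannon entropy, since the abstract does not spell them out. The function names (`cross_model_perplexity`, `cross_model_entropy`, `auroc`) and the toy logits are hypothetical. In practice the logits would come from one forward pass of the verifier over the prompt plus the generated answer.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one vocabulary-sized logit vector."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def cross_model_perplexity(
    verifier_logits: Sequence[Sequence[float]],
    answer_token_ids: Sequence[int],
) -> float:
    """Assumed CMP: exp of the verifier's mean negative log-likelihood
    of the generator's answer tokens (higher means more surprised)."""
    nll = 0.0
    for logits, token_id in zip(verifier_logits, answer_token_ids):
        nll -= log_softmax(logits)[token_id]
    return math.exp(nll / len(answer_token_ids))


def cross_model_entropy(verifier_logits: Sequence[Sequence[float]]) -> float:
    """Assumed CME: mean Shannon entropy (in nats) of the verifier's
    predictive distribution at each answer position."""
    total = 0.0
    for logits in verifier_logits:
        log_probs = log_softmax(logits)
        total -= sum(math.exp(lp) * lp for lp in log_probs)
    return total / len(verifier_logits)


def auroc(scores: Sequence[float], is_wrong: Sequence[bool]) -> float:
    """Probability that a wrong answer scores higher than a correct one
    (ties count half). Used only to evaluate a signal offline, where
    correctness labels are available."""
    pos = [s for s, w in zip(scores, is_wrong) if w]
    neg = [s for s, w in zip(scores, is_wrong) if not w]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: a 4-token vocabulary and a 2-token answer.
# The verifier strongly expects token 0 at both positions.
peaked = [[10.0, 0.0, 0.0, 0.0], [10.0, 0.0, 0.0, 0.0]]
cmp_agree = cross_model_perplexity(peaked, [0, 0])     # answer matches verifier
cmp_disagree = cross_model_perplexity(peaked, [1, 2])  # answer surprises verifier
print(cmp_agree < cmp_disagree)
```

Neither score needs the verifier to generate text or any correctness label at scoring time. Labels enter only in the `auroc` helper, which measures how well a score separates wrong answers from correct ones on a labeled evaluation set.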

Subjects: Artificial Intelligence (cs.AI)

Cite as: arXiv:2603.25450 [cs.AI]

(or arXiv:2603.25450v1 [cs.AI] for this version)

https://doi.org/10.48550/arXiv.2603.25450

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Matt Gorbett
[v1] Thu, 26 Mar 2026 13:46:22 UTC (437 KB)
