FACTUM: Mechanistic Detection of Citation Hallucination in Long-Form RAG

arXiv, March 31, 2026

arXiv:2601.05866v4 Announce Type: replace

Authors: Maxime Dassen, Rebecca Kotula, Kenton Murray, Andrew Yates, Dawn Lawrie, Efsun Kayi, James Mayfield, Kevin Duh

Abstract: Retrieval-Augmented Generation (RAG) models are critically undermined by citation hallucinations, a deceptive failure where a model cites a source that fails to support its claim. While existing work attributes hallucination to a simple over-reliance on parametric knowledge, we reframe this failure as an evolving, scale-dependent coordination failure between the Attention (reading) and Feed-Forward Network (recalling) pathways. We introduce FACTUM (Framework for Attesting Citation Trustworthiness via Underlying Mechanisms), a framework of four mechanistic scores: Contextual Alignment (CAS), Attention Sink Usage (BAS), Parametric Force (PFS), and Pathway Alignment (PAS). Our analysis reveals that correct citations are consistently marked by higher parametric force (PFS) and greater use of the attention sink (BAS) for information synthesis. Crucially, we find that "one-size-fits-all" theories are insufficient as the signature of correctness evolves with scale: while the 3B model relies on high pathway alignment (PAS), our best-performing 8B detector identifies a shift toward a specialized strategy where pathways provide distinct, orthogonal information. By capturing this complex interplay, FACTUM outperforms state-of-the-art baselines by up to 37.5% in AUC. Our results demonstrate that high parametric force is constructive when successfully coordinated with the Attention pathway, paving the way for more nuanced and reliable RAG systems.
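The abstract reports detector quality as AUC. As a reminder of what that metric measures, here is a minimal sketch of ROC AUC computed as the probability that a randomly chosen correct citation receives a higher detector score than a randomly chosen hallucinated one. The function name `roc_auc` and the toy scores and labels are hypothetical and only for illustration; they are not taken from the paper, and this is not the authors' evaluation code.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical detector outputs (higher = more likely the citation is correct).
# Label 1 = correct citation, label 0 = hallucinated citation.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.4]
toy_labels = [1, 1, 1, 0, 0, 0]

auc = roc_auc(toy_scores, toy_labels)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to a detector that ranks citations no better than chance, and 1.0 to one that ranks every correct citation above every hallucinated one. The pairwise loop is quadratic in the number of examples, which is fine for a small illustration; a rank-based implementation would be preferable on real evaluation sets.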

Comments: Accepted at ECIR 2026. 13 pages, 2 figures

Subjects: Computation and Language (cs.CL)

ACM classes: H.3.3; I.2.7

Cite as: arXiv:2601.05866 [cs.CL]

(or arXiv:2601.05866v4 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2601.05866

arXiv-issued DOI via DataCite

Submission history

From: Maxime Dassen
[v1] Fri, 9 Jan 2026 15:41:08 UTC (1,047 KB)
[v2] Fri, 16 Jan 2026 13:21:03 UTC (1,064 KB)
[v3] Mon, 23 Mar 2026 08:36:12 UTC (1,065 KB)
[v4] Sun, 29 Mar 2026 07:00:05 UTC (1,065 KB)
