
Towards Reliable Truth-Aligned Uncertainty Estimation in Large Language Models

ArXiv CS.AI | by Ponhvoan Srey, Quang Minh Nguyen, Xiaobao Wu, Anh Tuan Luu | April 2, 2026

arXiv:2604.00445v1 Announce Type: new

Abstract: Uncertainty estimation (UE) aims to detect hallucinated outputs of large language models (LLMs) to improve their reliability. However, UE metrics often exhibit unstable performance across configurations, which significantly limits their applicability. In this work, we formalise this phenomenon as proxy failure, since most UE metrics originate from model behaviour, rather than being explicitly grounded in the factual correctness of LLM outputs. With this, we show that UE metrics become non-discriminative precisely in low-information regimes. To alleviate this, we propose Truth AnChoring (TAC), a post-hoc calibration method to remedy UE metrics, by mapping the raw scores to truth-aligned scores. Even with noisy and few-shot supervision, our TAC can support the learning of well-calibrated uncertainty estimates, and presents a practical calibration protocol. Our findings highlight the limitations of treating heuristic UE metrics as direct indicators of truth uncertainty, and position our TAC as a necessary step toward more reliable uncertainty estimation for LLMs. The code repository is available at this https URL.
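The abstract does not say how TAC maps raw scores to truth-aligned scores, so the sketch below is not the authors' method. It only illustrates the general idea of post-hoc calibration, using a swapped-in generic technique (Platt-style logistic scaling) fitted on a handful of noisy correctness labels. The function names, the example scores, and the labels are all hypothetical.

```python
import math


def fit_logistic_map(raw_scores, is_wrong, lr=0.1, steps=2000):
    """Fit p(wrong | s) = sigmoid(a * s + b) by gradient descent on log loss.

    Assumption: a generic Platt-style scaling, NOT the TAC procedure
    from the paper, whose details the abstract does not give.
    """
    a, b = 0.0, 0.0
    n = len(raw_scores)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(raw_scores, is_wrong):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s  # d(log loss)/da for this example
            grad_b += (p - y)      # d(log loss)/db for this example
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrated_uncertainty(raw_score, a, b):
    """Map a raw UE score to an estimated probability that the output is wrong."""
    return 1.0 / (1.0 + math.exp(-(a * raw_score + b)))


# Hypothetical few-shot calibration set: raw UE scores (e.g. an entropy-like
# heuristic) paired with noisy 0/1 labels, where 1 means the answer was wrong.
raw_scores = [0.2, 0.4, 0.5, 0.9, 1.3, 1.6, 2.0, 2.4]
is_wrong = [0, 0, 1, 0, 1, 1, 0, 1]

a, b = fit_logistic_map(raw_scores, is_wrong)
low = calibrated_uncertainty(0.2, a, b)
high = calibrated_uncertainty(2.4, a, b)
print(f"slope={a:.3f} intercept={b:.3f} p_wrong(0.2)={low:.3f} p_wrong(2.4)={high:.3f}")
```

The point of the sketch is the supervision signal: the mapping is fitted against correctness labels rather than against model behaviour alone, which is the distinction the abstract draws when it describes proxy failure.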

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

Cite as: arXiv:2604.00445 [cs.AI]

(or arXiv:2604.00445v1 [cs.AI] for this version)

https://doi.org/10.48550/arXiv.2604.00445

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Ponhvoan Srey
[v1] Wed, 1 Apr 2026 03:42:16 UTC (6,423 KB)

Original source

ArXiv CS.AI

https://arxiv.org/abs/2604.00445



