Live
Black Hat USAAI BusinessBlack Hat AsiaAI BusinessWhen the Scraper Breaks Itself: Building a Self-Healing CSS Selector Repair SystemDEV CommunitySelf-Referential Generics in Kotlin: When Type Safety Requires Talking to YourselfDEV CommunitySources: Amazon is in talks to acquire Globalstar to bolster its low Earth orbit satellite business; Apple's 20% stake in Globalstar is a complicating factor (Financial Times)TechmemeZ.ai Launches GLM-5V-Turbo: A Native Multimodal Vision Coding Model Optimized for OpenClaw and High-Capacity Agentic Engineering Workflows EverywhereMarkTechPostHow I Started Using AI Agents for End-to-End Testing (Autonoma AI)DEV CommunityHow AI Is Changing PTSD Recovery — And Why It MattersDEV CommunityDeepSource vs Coverity: Static Analysis ComparedDEV CommunityClaude Code's Source Didn't Leak. It Was Already Public for Years.DEV CommunityStop Accepting BGP Routes on Trust Alone: Deploy RPKI ROV on IOS-XE and IOS XR TodayDEV CommunityI Built 5 SaaS Products in 7 Days Using AIDEV CommunitySingle-cell imaging and machine learning reveal hidden coordination in algae's response to light stress - MSNGoogle News: Machine LearningGoogle Dramatically Upgrades Storage in Google AI Pro - Thurrott.comGoogle News: GeminiBlack Hat USAAI BusinessBlack Hat AsiaAI BusinessWhen the Scraper Breaks Itself: Building a Self-Healing CSS Selector Repair SystemDEV CommunitySelf-Referential Generics in Kotlin: When Type Safety Requires Talking to YourselfDEV CommunitySources: Amazon is in talks to acquire Globalstar to bolster its low Earth orbit satellite business; Apple's 20% stake in Globalstar is a complicating factor (Financial Times)TechmemeZ.ai Launches GLM-5V-Turbo: A Native Multimodal Vision Coding Model Optimized for OpenClaw and High-Capacity Agentic Engineering Workflows EverywhereMarkTechPostHow I Started Using AI Agents for End-to-End Testing (Autonoma AI)DEV CommunityHow AI Is Changing PTSD Recovery — And Why It MattersDEV CommunityDeepSource vs Coverity: Static Analysis ComparedDEV CommunityClaude Code's Source Didn't Leak. It Was Already Public for Years.DEV CommunityStop Accepting BGP Routes on Trust Alone: Deploy RPKI ROV on IOS-XE and IOS XR TodayDEV CommunityI Built 5 SaaS Products in 7 Days Using AIDEV CommunitySingle-cell imaging and machine learning reveal hidden coordination in algae's response to light stress - MSNGoogle News: Machine LearningGoogle Dramatically Upgrades Storage in Google AI Pro - Thurrott.comGoogle News: Gemini

Improving Clinical Diagnosis with Counterfactual Multi-Agent Reasoning

arXivMarch 31, 20262 min read0 views
Source Quiz

arXiv:2603.27820v1 Announce Type: new Abstract: Clinical diagnosis is a complex reasoning process in which clinicians gather evidence, form hypotheses, and test them against alternative explanations. In medical training, this reasoning is explicitly developed through counterfactual questioning--e.g., asking how a diagnosis would change if a key symptom were absent or altered--to strengthen differential diagnosis skills. As large language model (LLM)-based systems are increasingly used for diagnostic support, ensuring the interpretability of their recommendations becomes critical. However, most — Zhiwen You, Xi Chen, Aniket Vashishtha, Simo Du, Gabriel Erion-Barner, Hongyuan Mei, Hao Peng, Yue Guo

View PDF HTML (experimental)

Abstract:Clinical diagnosis is a complex reasoning process in which clinicians gather evidence, form hypotheses, and test them against alternative explanations. In medical training, this reasoning is explicitly developed through counterfactual questioning--e.g., asking how a diagnosis would change if a key symptom were absent or altered--to strengthen differential diagnosis skills. As large language model (LLM)-based systems are increasingly used for diagnostic support, ensuring the interpretability of their recommendations becomes critical. However, most existing LLM-based diagnostic agents reason over fixed clinical evidence without explicitly testing how individual findings support or weaken competing diagnoses. In this work, we propose a counterfactual multi-agent diagnostic framework inspired by clinician training that makes hypothesis testing explicit and evidence-grounded. Our framework introduces counterfactual case editing to modify clinical findings and evaluate how these changes affect competing diagnoses. We further define the Counterfactual Probability Gap, a method that quantifies how strongly individual findings support a diagnosis by measuring confidence shifts under these edits. These counterfactual signals guide multi-round specialist discussions, enabling agents to challenge unsupported hypotheses, refine differential diagnoses, and produce more interpretable reasoning trajectories. Across three diagnostic benchmarks and seven LLMs, our method consistently improves diagnostic accuracy over prompting and prior multi-agent baselines, with the largest gains observed in complex and ambiguous cases. Human evaluation further indicates that our framework produces more clinically useful, reliable, and coherent reasoning. These results suggest that incorporating counterfactual evidence verification is an important step toward building reliable AI systems for clinical decision support.

Subjects:

Computation and Language (cs.CL)

Cite as: arXiv:2603.27820 [cs.CL]

(or arXiv:2603.27820v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.27820

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Zhiwen You [view email] [v1] Sun, 29 Mar 2026 19:14:26 UTC (11,675 KB)

Was this article helpful?

Sign in to highlight and annotate this article

AI
Ask AI about this article
Powered by AI News Hub · full article context loaded
Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

researchpaperarxiv

Knowledge Map

Knowledge Map
TopicsEntitiesSource
Improving C…researchpaperarxivnlplanguage-mo…arXiv

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 195 connections
Scroll to zoom · drag to pan · click to open

Discussion

Sign in to join the discussion

No comments yet — be the first to share your thoughts!

More in Research Papers