Research Papers research paper arxiv nlp language-models

Improving Clinical Diagnosis with Counterfactual Multi-Agent Reasoning

arXivMarch 31, 20262 min read0 views

arXiv:2603.27820v1 Announce Type: new Abstract: Clinical diagnosis is a complex reasoning process in which clinicians gather evidence, form hypotheses, and test them against alternative explanations. In medical training, this reasoning is explicitly developed through counterfactual questioning--e.g., asking how a diagnosis would change if a key symptom were absent or altered--to strengthen differential diagnosis skills. As large language model (LLM)-based systems are increasingly used for diagnostic support, ensuring the interpretability of their recommendations becomes critical. However, most — Zhiwen You, Xi Chen, Aniket Vashishtha, Simo Du, Gabriel Erion-Barner, Hongyuan Mei, Hao Peng, Yue Guo

View PDF HTML (experimental)

Abstract:Clinical diagnosis is a complex reasoning process in which clinicians gather evidence, form hypotheses, and test them against alternative explanations. In medical training, this reasoning is explicitly developed through counterfactual questioning--e.g., asking how a diagnosis would change if a key symptom were absent or altered--to strengthen differential diagnosis skills. As large language model (LLM)-based systems are increasingly used for diagnostic support, ensuring the interpretability of their recommendations becomes critical. However, most existing LLM-based diagnostic agents reason over fixed clinical evidence without explicitly testing how individual findings support or weaken competing diagnoses. In this work, we propose a counterfactual multi-agent diagnostic framework inspired by clinician training that makes hypothesis testing explicit and evidence-grounded. Our framework introduces counterfactual case editing to modify clinical findings and evaluate how these changes affect competing diagnoses. We further define the Counterfactual Probability Gap, a method that quantifies how strongly individual findings support a diagnosis by measuring confidence shifts under these edits. These counterfactual signals guide multi-round specialist discussions, enabling agents to challenge unsupported hypotheses, refine differential diagnoses, and produce more interpretable reasoning trajectories. Across three diagnostic benchmarks and seven LLMs, our method consistently improves diagnostic accuracy over prompting and prior multi-agent baselines, with the largest gains observed in complex and ambiguous cases. Human evaluation further indicates that our framework produces more clinically useful, reliable, and coherent reasoning. These results suggest that incorporating counterfactual evidence verification is an important step toward building reliable AI systems for clinical decision support.

Subjects:

Computation and Language (cs.CL)

Cite as: arXiv:2603.27820 [cs.CL]

(or arXiv:2603.27820v1 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2603.27820

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Zhiwen You [view email] [v1] Sun, 29 Mar 2026 19:14:26 UTC (11,675 KB)

Original source

arXiv

https://arxiv.org/abs/2603.27820

Was this article helpful?

Ask AI about this article

Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

researchpaperarxiv

ModelsLive

Google's TurboQuant saves memory, but won't save us from DRAM-pricing hell

<h4>Chocolate Factory’s compression tech clears the way to cheaper AI inference, not more affordable memory</h4> When Google unveiled <a target="_blank" rel="nofollow" href="https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/">TurboQuant</a>, an AI data compression technology that promises to slash the amount of memory required to serve models, many hoped it would help with a memory shortage that has seen prices triple since last year. Not so much.…

The Register AI/ML

1mabout 1 hour ago

Research PapersFresh

Illinois Tech computer science researcher honored by IEEE Chicago Section - EurekAlert!

<a href="https://news.google.com/rss/articles/CBMiXEFVX3lxTE13OVpWMEk1Z3hlMkR2bHNBQ2dkazFwb3VqN3hCa29GWGJvSVlPa00zd2xUakRmYXFqQmc5OWU0eGl4a21FMDAwWUN2Q3p0M3FrbXBkNV8zN0cxaG1s?oc=5" target="_blank">Illinois Tech computer science researcher honored by IEEE Chicago Section</a> EurekAlert!

Google News: Machine Learning

1mabout 2 hours ago

ProductsFresh

My Journey to becoming a Quantum Engineer

I have procrastinated on documenting this process for the longest time. But I think i am ready now (maybe). Coming from a front end engineering background, I am fascinated by the work being done by the quantum engineers at IBM. I am not that great with maths and statistics but I believe anything can be learned with tons of practice and consistency. I want to use this platform to hold myself accountable (that is if i don't give up half way and delete all my posts. I'll try not to btw). This is an article describing <a href="https://www.ibm.com/think/topics/quantum-computing" rel="noopener noreferrer">what quantum computing is</a> and some of it's use cases. I became an IBM qiskit advocate late last year and I have been exposed to a lot of resources and networked a bun

DEV Community

2mabout 2 hours ago

Knowledge Map

TopicsEntitiesSource

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 195 connections

Scroll to zoom · drag to pan · click to open

Discussion

No comments yet — be the first to share your thoughts!

More in Research Papers

Research PapersFresh

Illinois Tech computer science researcher honored by IEEE Chicago Section - EurekAlert!

Google News: Machine Learning

1mabout 2 hours ago

Research PapersFresh

Research roundup: 7 cool science stories we almost missed

Ars Technica

1mabout 4 hours ago

Research PapersFresh

AI maps science papers to predict research trends two to three years ahead - Tech Xplore

<a href="https://news.google.com/rss/articles/CBMie0FVX3lxTE5aTkZYTWdaRDZwTXNRMldpMG1WZ1YzWDZTOHN5M183Z3A1ZTFYbnhEWTdPRmpvZnZFU0xodlRsNWxFaGxTcEpwalhJNmJpQWE5VjhaRS1tOXJIeTc5Z0JNblJ3dFd4WjRYZGJOX0NrWGt6ZmZJVTBpRm5wWQ?oc=5" target="_blank">AI maps science papers to predict research trends two to three years ahead</a> Tech Xplore

Google News: Machine Learning

1mabout 4 hours ago

Research PapersFresh

AI inspires new research topics in materials science - Nanowerk

<a href="https://news.google.com/rss/articles/CBMiZ0FVX3lxTFBPWlJSM2ExeVQ3LVppTm45NHpEMW9YVkxscThCNDd2OVB0c3J1ZmVCbWNSZWZ0TjZwSzlOdEFXN2UtRk5LU1hxdXd4ZklldGxoM0FZSnhCd19PWkNHQ1ZRVDNwSHNUSk0?oc=5" target="_blank">AI inspires new research topics in materials science</a> Nanowerk

GNews AI materials

1mabout 12 hours ago