Towards a Medical AI Scientist
arXiv:2603.28589v1 Announce Type: new
Authors: Hongtao Wu, Boyun Zheng, Dingjie Song, Yu Jiang, Jianfeng Gao, Lei Xing, Lichao Sun, Yixuan Yuan
Abstract: Autonomous systems that generate scientific hypotheses, conduct experiments, and draft manuscripts have recently emerged as a promising paradigm for accelerating discovery. However, existing AI Scientists remain largely domain-agnostic, which limits their applicability to clinical medicine, where research must be grounded in medical evidence and must handle specialized data modalities. In this work, we introduce Medical AI Scientist, the first autonomous research framework tailored to clinical research. It enables clinically grounded ideation by transforming extensively surveyed literature into actionable evidence through a clinician-engineer co-reasoning mechanism, which improves the traceability of generated research ideas. It further facilitates evidence-grounded manuscript drafting guided by structured medical compositional conventions and ethical policies. The framework operates in three research modes, namely paper-based reproduction, literature-inspired innovation, and task-driven exploration, each corresponding to a distinct level of automated scientific inquiry with progressively increasing autonomy. Comprehensive evaluations by both large language models and human experts demonstrate that the ideas generated by the Medical AI Scientist are of substantially higher quality than those produced by commercial LLMs across 171 cases, 19 clinical tasks, and 6 data modalities. Our system also achieves strong alignment between the proposed method and its implementation, and demonstrates significantly higher success rates in executable experiments. Double-blind evaluations by human experts and the Stanford Agentic Reviewer suggest that the generated manuscripts approach MICCAI-level quality, while consistently surpassing those from ISBI and BIBM. The proposed Medical AI Scientist highlights the potential of leveraging AI for autonomous scientific discovery in healthcare.
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2603.28589 [cs.AI]
(or arXiv:2603.28589v1 [cs.AI] for this version)
https://doi.org/10.48550/arXiv.2603.28589
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Hongtao Wu
[v1] Mon, 30 Mar 2026 15:37:25 UTC (7,830 KB)


