
MedOpenClaw: Auditable Medical Imaging Agents Reasoning over Uncurated Full Studies

HuggingFace Papers · March 25, 2026 · 8 min read

AI-generated summary: MEDOPENCLAW and MEDFLOWBENCH enable evaluation of vision-language models in medical imaging by allowing dynamic interaction with 3D medical volumes through standard viewers, revealing limitations in spatial reasoning when accessing professional tools.

Published on Mar 25 · Submitted by liu on Mar 30


Abstract

Current evaluations of vision-language models (VLMs) in medical imaging oversimplify clinical reality by relying on pre-selected 2D images, which take significant manual labor to curate. This setup misses the core challenge of real-world diagnostics: a true clinical agent must actively navigate full 3D volumes across multiple sequences or modalities to gather evidence and support a final decision. To address this, we propose MEDOPENCLAW, an auditable runtime that lets VLMs operate dynamically within standard medical tools or viewers (e.g., 3D Slicer). On top of this runtime, we introduce MEDFLOWBENCH, a full-study medical imaging benchmark covering multi-sequence brain MRI and lung CT/PET. It systematically evaluates medical agentic capabilities across viewer-only, tool-use, and open-method tracks. Initial results reveal a critical insight: while state-of-the-art LLMs/VLMs (e.g., Gemini 3.1 Pro and GPT-5.4) can navigate the viewer to solve basic study-level tasks, their performance paradoxically degrades when they are given access to professional support tools, because they lack precise spatial grounding. By bridging the gap between static-image perception and interactive clinical workflows, MEDOPENCLAW and MEDFLOWBENCH establish a reproducible foundation for developing auditable, full-study medical imaging agents.
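As a rough illustration of what an "auditable" viewer interaction could look like, the following is a minimal sketch in plain Python. It is not MEDOPENCLAW's API and does not use 3D Slicer: the class `AuditedViewer`, its methods, and the helper `brightest_slice` are hypothetical names invented here, and the "volume" is just a nested list of numbers. The only idea it is meant to show is that every navigation or measurement step an agent takes is appended to a log that can be replayed or inspected afterwards.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List


@dataclass
class AuditRecord:
    """One logged viewer action (hypothetical schema)."""
    step: int
    action: str
    args: Dict[str, Any]
    result: Any


@dataclass
class AuditedViewer:
    """Toy stand-in for a 3D viewer: the volume is a list of 2D slices."""
    volume: List[List[List[float]]]
    slice_index: int = 0
    log: List[AuditRecord] = field(default_factory=list)

    def _record(self, action: str, args: Dict[str, Any], result: Any) -> Any:
        # Append the action to the audit trail and pass the result through.
        self.log.append(AuditRecord(len(self.log), action, args, result))
        return result

    def scroll_to(self, index: int) -> int:
        # Clamp the requested index to the valid slice range.
        clamped = max(0, min(index, len(self.volume) - 1))
        self.slice_index = clamped
        return self._record("scroll_to", {"index": index}, clamped)

    def mean_intensity(self) -> float:
        # Average all pixel values of the currently displayed slice.
        values = [v for row in self.volume[self.slice_index] for v in row]
        mean = sum(values) / len(values)
        return self._record("mean_intensity", {"slice": self.slice_index}, mean)

    def export_log(self) -> str:
        # Serialize the audit trail so a reviewer can inspect every step.
        return json.dumps([asdict(r) for r in self.log])


def brightest_slice(viewer: AuditedViewer) -> int:
    """Scripted 'agent': visit every slice and return the brightest one."""
    best_index, best_mean = 0, float("-inf")
    for i in range(len(viewer.volume)):
        viewer.scroll_to(i)
        m = viewer.mean_intensity()
        if m > best_mean:
            best_index, best_mean = i, m
    return best_index
```

In this toy setup the scripted loop plays the role that a VLM would play in a viewer-only track: it can only act through the viewer's methods, and the exported log records which slices were inspected before an answer was given.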


Get this paper in your agent:

hf papers read 2603.24649

Don't have the latest CLI?

curl -LsSf https://hf.co/cli/install.sh | bash
