
MedLoc-R1: Performance-Aware Curriculum Reward Scheduling for GRPO-Based Medical Visual Grounding

arXiv, March 31, 2026

Authors: Guangjing Yang, Ziyuan Qin, Chaoran Zhang, Chenlin Du, Jinlin Wang, Wanran Sun, Zhenyu Zhang, Bing Ji, Qicheng Lao


Abstract: Medical visual grounding serves as a crucial foundation for fine-grained multimodal reasoning and interpretable clinical decision support. Despite recent advances in reinforcement learning (RL) for grounding tasks, existing approaches such as Group Relative Policy Optimization (GRPO) suffer from severe reward sparsity when directly applied to medical images, primarily due to the inherent difficulty of localizing small or ambiguous regions of interest, which is further exacerbated by the rigid and suboptimal nature of fixed IoU-based reward schemes in RL. This leads to vanishing policy gradients and stagnated optimization, particularly during early training. To address this challenge, we propose MedLoc-R1, a performance-aware reward scheduling framework that progressively tightens the reward criterion in accordance with model readiness. MedLoc-R1 introduces a sliding-window performance tracker and a multi-condition update rule that automatically adjust the reward schedule from dense, easily obtainable signals to stricter, fine-grained localization requirements, while preserving the favorable properties of GRPO without introducing auxiliary networks or additional gradient paths. Experiments on three medical visual grounding benchmarks demonstrate that MedLoc-R1 consistently improves both localization accuracy and training stability over GRPO-based baselines. Our framework offers a general, lightweight, and effective solution for RL-based grounding in high-stakes medical applications. Code & checkpoints are available at this https URL.
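The abstract does not give the exact update rule, so the following is only a minimal sketch of how a sliding-window tracker might drive an IoU-threshold schedule. Everything specific here is an assumption for illustration: the class name `ThresholdScheduler`, the binary thresholded reward, the four readiness conditions, and all default values (`start`, `end`, `step`, `window`, `target`, `cooldown`) are hypothetical and not taken from the paper. The `group_advantages` helper uses the standard GRPO-style normalization to show why an all-zero reward group yields no learning signal.

```python
from collections import deque
from statistics import mean, pstdev


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def group_advantages(rewards, eps=1e-6):
    """Group-relative advantages: (r - group mean) / (group std + eps)."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


class ThresholdScheduler:
    """Hypothetical scheduler that tightens an IoU threshold over training."""

    def __init__(self, start=0.3, end=0.75, step=0.05,
                 window=50, target=0.6, cooldown=20):
        # All defaults are illustrative assumptions, not values from the paper.
        self.threshold, self.end, self.step = start, end, step
        self.target, self.cooldown = target, cooldown
        self.history = deque(maxlen=window)  # sliding window of success rates
        self.steps_since_update = 0

    def reward(self, pred_box, gt_box):
        """Binary reward: 1.0 if the IoU meets the current threshold."""
        return 1.0 if iou(pred_box, gt_box) >= self.threshold else 0.0

    def record(self, batch_success_rate):
        """Log one training step and tighten the threshold if all conditions hold."""
        self.history.append(batch_success_rate)
        self.steps_since_update += 1
        ready = (
            len(self.history) == self.history.maxlen    # window is full
            and mean(self.history) >= self.target       # model is doing well
            and self.steps_since_update >= self.cooldown  # not updated too recently
            and self.threshold < self.end               # room left to tighten
        )
        if ready:
            self.threshold = min(self.end, self.threshold + self.step)
            self.history.clear()
            self.steps_since_update = 0
        return self.threshold
```

With a strict fixed threshold, a group of sampled boxes that all miss receives identical zero rewards, so `group_advantages([0.0, 0.0, 0.0, 0.0])` is all zeros and the policy gradient for that group vanishes. Starting from a looser threshold and raising it only when the recent success rate is high is one way to keep rewards varied within a group early in training.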

Comments: 2026 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.28120 [cs.CV]

(or arXiv:2603.28120v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.28120

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Yang Guangjing
[v1] Mon, 30 Mar 2026 07:31:21 UTC (2,202 KB)
