
LoGSAM: Parameter-Efficient Cross-Modal Grounding for MRI Segmentation

arXiv · [Submitted on 18 Mar 2026 (v1), last revised 27 Mar 2026 (this version, v2)] · March 30, 2026

Authors: Mohammad Robaitul Islam Bhuiyan, Sheethal Bhat, Melika Qahqaie, Tri-Thien Nguyen, Paula Andrea Perez-Toro, Tomas Arias-Vergara, Andreas Maier


Abstract: Precise localization and delineation of brain tumors using Magnetic Resonance Imaging (MRI) are essential for planning therapy and guiding surgical decisions. However, most existing approaches rely on task-specific supervised models and are constrained by the limited availability of annotated data. To address this, we propose LoGSAM, a parameter-efficient, detection-driven framework that transforms radiologist dictation into text prompts for foundation-model-based localization and segmentation. Radiologist speech is first transcribed and translated using a pretrained Whisper ASR model, followed by negation-aware clinical NLP to extract tumor-specific textual prompts. These prompts guide text-conditioned tumor localization via a LoRA-adapted vision-language detection model, Grounding DINO (GDINO). The LoRA adaptation updates 5% of the model parameters, enabling computationally efficient domain adaptation while preserving pretrained cross-modal knowledge. The predicted bounding boxes are used as prompts for MedSAM to generate pixel-level tumor masks without any additional fine-tuning. Conditioning the frozen MedSAM on LoGSAM-derived priors yields a state-of-the-art Dice score of 80.32% on BRISC 2025. In addition, we evaluate the full pipeline using German dictations from a board-certified radiologist on 12 unseen MRI scans, achieving 91.7% case-level accuracy. These results highlight the feasibility of constructing a modular speech-to-segmentation pipeline by leveraging pretrained foundation models with minimal parameter updates.
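Note: the abstract reports segmentation quality as a Dice score. As a reminder of what that metric measures, here is a minimal sketch of the standard Dice coefficient, 2|P ∩ T| / (|P| + |T|), for two binary masks. It is not the authors' evaluation code: the function name `dice_score`, the toy masks, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # rows of 0/1 pixel labels (hypothetical toy format)


def dice_score(pred: Mask, target: Mask) -> float:
    """Return 2|P ∩ T| / (|P| + |T|) for two binary masks of equal shape."""
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    # Count pixels that are foreground (1) in both masks.
    intersection = sum(
        p & t
        for p_row, t_row in zip(pred, target)
        for p, t in zip(p_row, t_row)
    )
    # Total foreground pixels across both masks.
    total = sum(map(sum, pred)) + sum(map(sum, target))

    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: predicted mask has 4 foreground pixels, reference has 3,
# and they overlap on 3, so Dice = 2 * 3 / (4 + 3) = 6/7.
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
target = [[0, 1, 1, 0],
          [0, 1, 0, 0],
          [0, 0, 0, 0]]

print(f"Dice: {dice_score(pred, target):.4f}")
```

A reported benchmark figure such as the one above is typically an average of this per-image quantity over a test set; how the paper aggregates it is not stated in the abstract.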

Comments: 10 pages, 3 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.17576 [cs.CV] (or arXiv:2603.17576v2 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.17576

arXiv-issued DOI via DataCite

Submission history

From: Mohammad Robaitul Islam Bhuiyan
[v1] Wed, 18 Mar 2026 10:33:32 UTC (1,026 KB)
[v2] Fri, 27 Mar 2026 14:59:18 UTC (1,026 KB)
