
IP-SAM: Prompt-Space Conditioning for Prompt-Absent Camouflaged Object Detection

arXiv, March 31, 2026

arXiv:2603.27250v1 (announce type: new)

Authors: Huiyao Zhang, Jin Bai, Rui Guo, JianWen Tan, HongFei Wang, Ye Li

Abstract: Prompt-conditioned foundation segmenters have emerged as a dominant paradigm for image segmentation, where explicit spatial prompts (e.g., points, boxes, masks) guide mask decoding. However, many real-world deployments require fully automatic segmentation, creating a structural mismatch: the decoder expects prompts that are unavailable at inference. Existing adaptations typically modify intermediate features, inadvertently bypassing the model's native prompt interface and weakening prompt-conditioned decoding. We propose IP-SAM, which revisits adaptation from a prompt-space perspective through prompt-space conditioning. Specifically, a Self-Prompt Generator (SPG) distills image context into complementary intrinsic prompts that serve as coarse regional anchors. These cues are projected through SAM2's frozen prompt encoder, restoring prompt-guided decoding without external intervention. To suppress background-induced false positives, Prompt-Space Gating (PSG) leverages the intrinsic background prompt as an asymmetric suppressive constraint prior to decoding. Under a deterministic no-external-prompt protocol, IP-SAM achieves state-of-the-art performance across four camouflaged object detection benchmarks (e.g., MAE 0.017 on COD10K) with only 21.26M trainable parameters (optimizing SPG, PSG, and a task-specific mask decoder trained from scratch, alongside image-encoder LoRA while keeping the prompt encoder frozen). Furthermore, the proposed conditioning strategy generalizes beyond COD to medical polyp segmentation, where a model trained solely on Kvasir-SEG exhibits strong zero-shot transfer to both CVC-ClinicDB and ETIS.
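The abstract reports MAE without defining it. As a minimal sketch, assuming the definition commonly used in camouflaged object detection (the mean per-pixel absolute difference between a prediction map scaled to [0, 1] and a binary ground-truth mask), the metric can be computed as below. The function name `mean_absolute_error` and the toy arrays are illustrative assumptions and are not taken from the paper or its code.

```python
import numpy as np


def mean_absolute_error(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean per-pixel absolute difference between a prediction map and a mask.

    Assumes `pred` holds values meant to lie in [0, 1] and `gt` is a
    binary mask (anything above 0.5 counts as foreground).
    """
    if pred.shape != gt.shape:
        raise ValueError("prediction and ground truth must have the same shape")
    pred = np.clip(pred.astype(np.float64), 0.0, 1.0)  # guard against out-of-range values
    gt = (gt > 0.5).astype(np.float64)                 # binarize the ground truth
    return float(np.abs(pred - gt).mean())


# Toy 2x2 example with made-up values (not data from the paper).
pred = np.array([[0.9, 0.1], [0.2, 0.8]])
gt = np.array([[1, 0], [0, 1]])
score = mean_absolute_error(pred, gt)  # per-pixel errors 0.1, 0.1, 0.2, 0.2
```

Lower is better: a perfect prediction scores 0. Benchmark numbers are typically an average of this per-image value over a test set, though the exact evaluation protocol behind the reported 0.017 is not stated in the abstract.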

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.27250 [cs.CV]

(or arXiv:2603.27250v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.27250

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Huiyao Zhang
[v1] Sat, 28 Mar 2026 11:52:55 UTC (2,409 KB)
