SAM 3: Segment Anything with Concepts
Authors: Nicolas Carion, Laura Gustafson, Yuan-Ting Hu, Shoubhik Debnath, Ronghang Hu, Didac Suris, Chaitanya Ryali, Kalyan Vasudev Alwala, Haitham Khedr, Andrew Huang, Jie Lei, Tengyu Ma, Baishan Guo, Arpit Kalla, Markus Marks, Joseph Greer, Meng Wang, Peize Sun, Roman Rädle, Triantafyllos Afouras, Effrosyni Mavroudi, Katherine Xu, Tsung-Han Wu, Yu Zhou, Liliane Momeni, Rishi Hazra, Shuangrui Ding, Sagar Vaze, Francois Porcher, Feng Li, Siyuan Li, Aishwarya Kamath, Ho Kei Cheng, Piotr Dollár, Nikhila Ravi, Kate Saenko, Pengchuan Zhang, Christoph Feichtenhofer
Abstract: We present Segment Anything Model (SAM) 3, a unified model that detects, segments, and tracks objects in images and videos based on concept prompts, which we define as either short noun phrases (e.g., "yellow school bus"), image exemplars, or a combination of both. Promptable Concept Segmentation (PCS) takes such prompts and returns segmentation masks and unique identities for all matching object instances. To advance PCS, we build a scalable data engine that produces a high-quality dataset with 4M unique concept labels, including hard negatives, across images and videos. Our model consists of an image-level detector and a memory-based video tracker that share a single backbone. Recognition and localization are decoupled with a presence head, which boosts detection accuracy. SAM 3 doubles the accuracy of existing systems in both image and video PCS, and improves previous SAM capabilities on visual segmentation tasks. We open source SAM 3 along with our new Segment Anything with Concepts (SA-Co) benchmark for promptable concept segmentation.
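As an illustration of what decoupling recognition from localization could look like, the sketch below separates an image-level "is the concept present at all?" score from per-candidate "does this instance match?" scores. The abstract does not say how the presence head's output is combined with per-object scores; the product rule, the threshold, and every name here (`score_detections`, `presence_logit`, `query_logits`) are assumptions made for illustration, not the SAM 3 API.

```python
import math
from dataclasses import dataclass


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


@dataclass
class Detection:
    instance_id: int  # index of the candidate object query (hypothetical)
    score: float      # combined recognition * localization confidence


def score_detections(presence_logit, query_logits, threshold=0.5):
    """Combine one global presence score with per-candidate scores.

    Assumption: the final confidence is the product of the two
    probabilities. The abstract only states that a presence head
    decouples recognition from localization.
    """
    presence = sigmoid(presence_logit)  # recognition: is the concept in the image?
    detections = []
    for i, logit in enumerate(query_logits):
        score = presence * sigmoid(logit)  # localization gated by presence
        if score >= threshold:
            detections.append(Detection(instance_id=i, score=score))
    return detections


# Concept likely present: strong candidates survive the threshold.
present = score_detections(4.0, [3.0, -2.0, 2.5])
# Concept likely absent: the same candidates are all suppressed.
absent = score_detections(-4.0, [3.0, -2.0, 2.5])
```

Under this assumed rule, a low presence score suppresses every candidate at once, which is one way a hard-negative prompt (a concept absent from the image) could be rejected without each candidate having to decide that on its own.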
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as: arXiv:2511.16719 [cs.CV]
(or arXiv:2511.16719v2 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2511.16719
arXiv-issued DOI via DataCite
Submission history
From: Christoph Feichtenhofer
[v1] Thu, 20 Nov 2025 18:59:56 UTC (37,393 KB)
[v2] Sat, 28 Mar 2026 16:54:56 UTC (37,496 KB)