
Integrating Multimodal Large Language Model Knowledge into Amodal Completion

arXiv, March 31, 2026

arXiv:2603.28333v1. Announce Type: cross. Authors: Heecheol Yun, Eunho Yang


Abstract: With the widespread adoption of autonomous vehicles and robotics, amodal completion, which reconstructs the occluded parts of people and objects in an image, has become increasingly crucial. Just as humans infer hidden regions based on prior experience and common sense, this task inherently requires physical knowledge about real-world entities. However, existing approaches either depend solely on the image generation ability of visual generative models, which lack such knowledge, or leverage it only during the segmentation stage, preventing it from explicitly guiding the completion process. To address this, we propose AmodalCG, a novel framework that harnesses the real-world knowledge of Multimodal Large Language Models (MLLMs) to guide amodal completion. Our framework first assesses the extent of occlusion so that MLLM guidance is invoked only when the target object is heavily occluded. If guidance is required, the framework further uses MLLMs to reason about both (1) the extent and (2) the content of the missing regions. Finally, a visual generative model integrates this guidance and iteratively refines imperfect completions that may arise from inaccurate MLLM guidance. Experimental results on various real-world images show impressive improvements over all existing methods, suggesting MLLMs as a promising direction for addressing challenging amodal completion.
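The control flow described in the abstract (gate on occlusion, query the MLLM for extent and content, then generate and refine) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract gives no API, so every name here (`Guidance`, `amodal_complete`, `occlusion_score`, `mllm_guide`, `generate`, `is_acceptable`) and both parameters (`threshold`, `max_rounds`) are hypothetical, and the stopping check used for refinement is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple


@dataclass
class Guidance:
    """Hypothetical container for the two kinds of MLLM guidance."""
    extent: Tuple[int, int, int, int]  # assumed: box (x0, y0, x1, y1) the full object should span
    content: str                       # assumed: text describing the hidden parts


def amodal_complete(
    image: Any,
    visible_mask: Any,
    occlusion_score: Callable[[Any, Any], float],
    mllm_guide: Callable[[Any, Any], Guidance],
    generate: Callable[[Any, Any, Optional[Guidance]], Any],
    is_acceptable: Callable[[Any], bool],
    threshold: float = 0.5,  # assumed cutoff for "heavily occluded"
    max_rounds: int = 3,     # assumed cap on refinement passes
) -> Tuple[Any, Optional[Guidance]]:
    # Step 1: estimate occlusion and consult the MLLM only when it is heavy.
    guidance = None
    if occlusion_score(image, visible_mask) >= threshold:
        # Step 2: ask the MLLM about the extent and content of the missing region.
        guidance = mllm_guide(image, visible_mask)

    # Step 3: generate a completion, then regenerate while the check fails.
    result = generate(image, visible_mask, guidance)
    for _ in range(max_rounds):
        if is_acceptable(result):
            break
        result = generate(result, visible_mask, guidance)
    return result, guidance
```

Passing the four stages in as callables keeps the sketch independent of any particular MLLM or generative model; in practice each would wrap a real model call.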

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

Cite as: arXiv:2603.28333 [cs.CV]

(or arXiv:2603.28333v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.28333

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Heecheol Yun. [v1] Mon, 30 Mar 2026 12:03:47 UTC (12,164 KB)
