GenHOI: Generalized Hand-Object Pose Estimation with Occlusion Awareness

arXiv, March 31, 2026
arXiv:2603.19013v3 (announce type: replace)

Authors: Hui Yang, Wei Sun, Jian Liu, Jian Xiao, Tao Xie, Hossein Rahmani, Ajmal Saeed Mian, Nicu Sebe, Gim Hee Lee

Abstract: Generalized 3D hand-object pose estimation from a single RGB image remains challenging due to the large variations in object appearances and interaction patterns, especially under heavy occlusion. We propose GenHOI, a framework for generalized hand-object pose estimation with occlusion awareness. GenHOI integrates hierarchical semantic knowledge with hand priors to enhance model generalization under challenging occlusion conditions. Specifically, we introduce a hierarchical semantic prompt that encodes object states, hand configurations, and interaction patterns via textual descriptions. This enables the model to learn abstract high-level representations of hand-object interactions for generalization to unseen objects and novel interactions while compensating for missing or ambiguous visual cues. To enable robust occlusion reasoning, we adopt a multi-modal masked modeling strategy over RGB images, predicted point clouds, and textual descriptions. Moreover, we leverage hand priors as stable spatial references to extract implicit interaction constraints. This allows reliable pose inference even under significant variations in object shapes and interaction patterns. Extensive experiments on the challenging DexYCB and HO3Dv2 benchmarks demonstrate that our method achieves state-of-the-art performance in hand-object pose estimation.
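The multi-modal masked modeling idea mentioned in the abstract can be illustrated roughly as follows. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not say how inputs are tokenized, what mask ratio is used, or how masks are sampled, so the function `mask_modalities`, the `MASK` sentinel, the toy token strings, and the 0.5 ratio are all hypothetical. The sketch simply hides a random fraction of tokens in each of the three modalities named in the abstract (RGB image, predicted point cloud, text), which is the kind of input a model would then be trained to reconstruct.

```python
import random

MASK = "<mask>"  # hypothetical sentinel; not taken from the paper


def mask_modalities(tokens, mask_ratio=0.5, seed=0):
    """Randomly hide a fraction of tokens in each modality.

    tokens: dict mapping a modality name to a list of tokens.
    Returns (masked, hidden), where `masked` has the same shape as
    `tokens` with hidden entries replaced by MASK, and `hidden` maps
    each modality name to the sorted list of masked indices.
    """
    rng = random.Random(seed)  # local RNG so the example is reproducible
    masked, hidden = {}, {}
    for name, seq in tokens.items():
        n_hide = int(round(len(seq) * mask_ratio))
        idx = sorted(rng.sample(range(len(seq)), n_hide))
        hide = set(idx)
        hidden[name] = idx
        masked[name] = [MASK if i in hide else tok for i, tok in enumerate(seq)]
    return masked, hidden


# Toy stand-ins for the three modalities (assumed token formats).
example = {
    "rgb": ["patch0", "patch1", "patch2", "patch3"],
    "points": ["pt0", "pt1", "pt2", "pt3"],
    "text": ["hand", "grasps", "mug", "handle"],
}
masked, hidden = mask_modalities(example, mask_ratio=0.5, seed=0)
```

In a training setup of this general kind, the entries listed in `hidden` would serve as reconstruction targets, so the model has to infer occluded or missing content in one modality from what remains visible in the others.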

Comments: 25 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.19013 [cs.CV]

(or arXiv:2603.19013v3 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.19013

arXiv-issued DOI via DataCite

Submission history

From: Hui Yang
[v1] Thu, 19 Mar 2026 15:19:23 UTC (1,273 KB)
[v2] Fri, 20 Mar 2026 10:37:04 UTC (1,273 KB)
[v3] Sun, 29 Mar 2026 05:50:06 UTC (1,273 KB)
