
SUG-Occ: Explicit Semantics and Uncertainty Guided Sparse Learning for Efficient 3D Occupancy Prediction

arXiv · March 31, 2026

arXiv:2601.11396v5 · Announce Type: replace
Authors: Hanlin Wu, Pengfei Lin, Ehsan Javanmardi, Naren Bao, Bo Qian, Hao Si, Manabu Tsukada


Abstract: 3D semantic occupancy prediction has emerged as a critical perception task for autonomous driving because it offers voxel-level semantic and geometric understanding of the environment. However, such a fine-grained representation of large-scale scenes incurs prohibitive computation, posing a significant challenge to practical real-time deployment. To address this, we propose SUG-Occ, an explicit semantics and uncertainty guided sparse learning framework for efficient occupancy prediction, which exploits the inherent sparsity of 3D scenes to reduce redundant computation while maintaining geometric and semantic integrity. Specifically, we first utilize semantic and uncertainty priors to suppress image projections from free space while employing explicit unsigned distance encoding to enhance geometric consistency, thereby producing a structurally sparse representation. Second, we introduce a cascade sparse completion module to enable efficient coarse-to-fine reasoning over the sparse representation via hyper cross sparse convolution, generative upsampling, and adaptive pruning. Finally, we propose an object contextual representation (OCR) based mask decoder that refines the voxel-wise predictions through lightweight query-context interactions, thereby avoiding expensive attention operations over volumetric features. Extensive experiments on the SemanticKITTI and Occ3D-nuScenes benchmarks demonstrate that the proposed approach outperforms the baselines, achieving notable improvements in both accuracy and efficiency across datasets.
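The abstract does not spell out how the semantic and uncertainty priors are combined, so the following is only a minimal sketch of the general idea, not the SUG-Occ implementation. It assumes each voxel carries a class-probability vector with index 0 meaning free space, uses Shannon entropy as a stand-in uncertainty measure, and drops a voxel only when it is confidently free. The function names, the thresholds, and the brute-force unsigned distance are all hypothetical choices made for illustration.

```python
import math

FREE = 0  # assumption: class index 0 denotes free space


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sparsify(voxel_probs, free_thresh=0.9, entropy_thresh=0.2):
    """Return the sorted coordinates of voxels that are kept.

    voxel_probs maps (x, y, z) -> list of class probabilities.
    A voxel is dropped only when it is confidently free: its free-space
    probability is at least free_thresh AND its entropy is below
    entropy_thresh. Both thresholds are illustrative, not from the paper.
    """
    kept = []
    for coord, probs in voxel_probs.items():
        confidently_free = (
            probs[FREE] >= free_thresh and entropy(probs) < entropy_thresh
        )
        if not confidently_free:
            kept.append(coord)
    return sorted(kept)


def unsigned_distance(coord, occupied):
    """Euclidean distance from coord to the nearest occupied voxel.

    Brute force over all occupied voxels; fine for a toy example only.
    """
    return min(math.dist(coord, o) for o in occupied)


voxels = {
    (0, 0, 0): [0.98, 0.01, 0.01],  # free and certain: dropped
    (1, 0, 0): [0.05, 0.90, 0.05],  # likely occupied: kept
    (2, 0, 0): [0.40, 0.30, 0.30],  # ambiguous: kept
    (3, 0, 0): [0.92, 0.04, 0.04],  # probably free but uncertain: kept
}

kept = sparsify(voxels)
print(kept)
print(unsigned_distance((0, 0, 0), [(3, 4, 0), (6, 8, 0)]))
```

In this toy setup the voxel at (3, 0, 0) survives even though its free-space probability exceeds the threshold, because its entropy is still high. That is the role an uncertainty prior plays alongside a semantic one: it avoids discarding regions the model is unsure about.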

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2601.11396 [cs.CV]

(or arXiv:2601.11396v5 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2601.11396

arXiv-issued DOI via DataCite

Submission history

From: Hanlin Wu
[v1] Fri, 16 Jan 2026 16:07:38 UTC (2,642 KB)
[v2] Wed, 21 Jan 2026 04:26:25 UTC (2,684 KB)
[v3] Thu, 22 Jan 2026 10:43:59 UTC (2,750 KB)
[v4] Mon, 9 Feb 2026 07:10:47 UTC (2,727 KB)
[v5] Sat, 28 Mar 2026 13:26:16 UTC (4,072 KB)
