SeGPruner: Semantic-Geometric Visual Token Pruner for 3D Question Answering

HuggingFace Papers · March 31, 2026 · 2 min read
A semantic-aware and geometry-guided token pruning framework is presented for efficient 3D question answering with multi-view images, achieving significant reductions in token budget and inference latency while maintaining competitive performance.

Published on Mar 31

Abstract

Vision-language models (VLMs) have been widely adopted for 3D question answering (3D QA). In typical pipelines, visual tokens extracted from multiple viewpoints are concatenated with language tokens and jointly processed by a large language model (LLM) for inference. However, aggregating multi-view observations inevitably introduces severe token redundancy, leading to an overly large visual token set that significantly hinders inference efficiency under constrained token budgets. Visual token pruning has emerged as a prevalent strategy to address this issue. Nevertheless, most existing pruners are primarily tailored to 2D inputs or rely on indirect geometric cues, which limits their ability to explicitly retain semantically critical objects and maintain sufficient spatial coverage for robust 3D reasoning. In this paper, we propose SeGPruner, a semantic-aware and geometry-guided token reduction framework for efficient 3D QA with multi-view images. Specifically, SeGPruner first preserves semantically salient tokens through an attention-based importance module (Saliency-aware Token Selector), ensuring that object-critical evidence is retained. It then complements these tokens with spatially diverse ones via a geometry-guided selector (Geometry-aware Token Diversifier), which jointly considers semantic relevance and 3D geometric distance. This cooperation between saliency preservation and geometry-guided diversification balances object-level evidence and global scene coverage under aggressive token reduction. Extensive experiments on ScanQA and OpenEQA demonstrate that SeGPruner substantially improves inference efficiency, reducing the visual token budget by 91% and inference latency by 86%, while maintaining competitive performance in 3D reasoning tasks.
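The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`select_salient`, `diversify`, `prune_tokens`), the parameters `salient_frac` and `alpha`, the weighted-sum scoring rule, and the greedy farthest-point-style selection are all assumptions made for illustration. The sketch assumes each visual token comes with an attention-derived saliency score and a 3D position.

```python
import numpy as np


def select_salient(scores, k):
    """Stage 1 (assumed form): indices of the k highest-scoring tokens."""
    k = min(k, scores.shape[0])
    return np.argsort(-scores, kind="stable")[:k]


def diversify(points, relevance, seed, k, alpha=0.5):
    """Stage 2 (assumed form): greedily add up to k tokens, scoring each
    candidate by a weighted sum of its relevance and its normalized 3D
    distance to the nearest token already kept."""
    chosen = [int(i) for i in seed]
    if not chosen:
        raise ValueError("seed must contain at least one token index")
    # Distance from every token to its nearest already-kept token.
    diffs = points[:, None, :] - points[chosen][None, :, :]
    min_dist = np.linalg.norm(diffs, axis=-1).min(axis=1)
    taken = np.zeros(points.shape[0], dtype=bool)
    taken[chosen] = True
    steps = max(0, min(k, points.shape[0] - len(chosen)))
    for _ in range(steps):
        spread = min_dist / (min_dist.max() + 1e-8)
        score = alpha * relevance + (1.0 - alpha) * spread
        score[taken] = -np.inf  # never pick the same token twice
        nxt = int(np.argmax(score))
        chosen.append(nxt)
        taken[nxt] = True
        # The new token may now be the nearest kept token for some candidates.
        new_dist = np.linalg.norm(points - points[nxt], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return np.array(chosen, dtype=int)


def prune_tokens(scores, points, budget, salient_frac=0.5, alpha=0.5):
    """Keep `budget` tokens: a salient share first, then spatially diverse ones."""
    if budget < 1:
        raise ValueError("budget must be at least 1")
    budget = min(budget, scores.shape[0])
    k_sal = max(1, int(round(budget * salient_frac)))
    # Rescale saliency to [0, 1] so it is comparable with normalized distance.
    rel = (scores - scores.min()) / (scores.max() - scores.min() + 1e-8)
    salient = select_salient(rel, k_sal)
    kept = diversify(points, rel, salient, budget - k_sal, alpha)
    return np.sort(kept)
```

As a usage example with hypothetical numbers, calling `prune_tokens(scores, points, budget=90)` on 1,000 multi-view tokens keeps 90 of them, which corresponds to a 91% reduction in the visual token count (1 - 90/1000). Setting `alpha` closer to 0 favors scene coverage, while values closer to 1 favor semantic relevance; how the paper actually balances the two is not specified in the abstract.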

Get this paper in your agent:

hf papers read 2603.29437

Don't have the latest CLI?

curl -LsSf https://hf.co/cli/install.sh | bash
