
Hg-I2P: Bridging Modalities for Generalizable Image-to-Point-Cloud Registration via Heterogeneous Graphs

arXiv, March 31, 2026

Authors: Pei An, Junfeng Ding, Jiaqi Yang, Yulong Wang, Jie Ma, Liangliang Nan


Abstract: Image-to-point-cloud (I2P) registration aims to align 2D images with 3D point clouds by establishing reliable 2D-3D correspondences. The drastic modality gap between images and point clouds makes it challenging to learn features that are both discriminative and generalizable, leading to severe performance drops in unseen scenarios. We address this challenge by introducing a heterogeneous graph that enables refining both cross-modal features and correspondences within a unified architecture. The proposed graph represents a mapping between segmented 2D and 3D regions, which enhances cross-modal feature interaction and thus improves feature discriminability. In addition, modeling the consistency among vertices and edges within the graph enables pruning of unreliable correspondences. Building on these insights, we propose a heterogeneous-graph-embedded I2P registration method, termed Hg-I2P. It learns a heterogeneous graph by mining multi-path feature relationships, adapts features under the guidance of heterogeneous edges, and prunes correspondences using graph-based projection consistency. Experiments on six indoor and outdoor benchmarks under cross-domain setups demonstrate that Hg-I2P significantly outperforms existing methods in both generalization and accuracy. Code is released (see the link on the arXiv abstract page).
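As a rough illustration of what pruning 2D-3D correspondences by projection consistency can look like, the sketch below keeps only the matches whose 3D point reprojects close to its paired pixel. It is not the Hg-I2P procedure: the abstract does not specify how the graph-based consistency is computed, so this assumes a plain pinhole camera, an already-known candidate pose `(R, t)`, and a fixed pixel threshold. The function names `project` and `prune_by_reprojection`, the intrinsics `K`, and the toy data are all hypothetical.

```python
import numpy as np


def project(points_3d, K, R, t):
    """Project Nx3 world points to Nx2 pixels with a pinhole camera model."""
    cam = points_3d @ R.T + t            # world frame -> camera frame
    uvw = cam @ K.T                      # apply the intrinsic matrix
    return uvw[:, :2] / uvw[:, 2:3], cam[:, 2]


def prune_by_reprojection(pixels, points_3d, K, R, t, max_err_px=3.0):
    """Return a boolean mask of 2D-3D pairs consistent with the pose (R, t).

    Assumption: consistency is a simple reprojection-error threshold,
    standing in for the paper's graph-based criterion.
    """
    proj, depth = project(points_3d, K, R, t)
    err = np.linalg.norm(proj - pixels, axis=1)
    return (depth > 0) & (err < max_err_px)


# Toy data (hypothetical): identity pose, three matches, the last one corrupted.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.zeros(3)
points = np.array([[0.0, 0.0, 2.0],
                   [0.2, -0.1, 4.0],
                   [-0.4, 0.3, 5.0]])
pixels, _ = project(points, K, R, t)
pixels[2] += [25.0, 0.0]                 # shift the third pixel by 25 px

mask = prune_by_reprojection(pixels, points, K, R, t)
print(mask)
```

In a full pipeline the surviving correspondences would typically be passed to a PnP solver to estimate the camera pose. Here the pose is given, so the check only shows the filtering step.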

Comments: Accepted to CVPR 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)

MSC classes: None

Cite as: arXiv:2603.27969 [cs.CV] (or arXiv:2603.27969v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.27969

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Pei An
[v1] Mon, 30 Mar 2026 02:45:27 UTC (30,161 KB)

Original source

arXiv

https://arxiv.org/abs/2603.27969



