
Robust Remote Sensing Image-Text Retrieval with Noisy Correspondence

arXiv · March 31, 2026

arXiv:2603.28134v1 · Announce Type: new
Authors: Qiya Song, Yiqiang Xie, Yuan Sun, Renwei Dian, Xudong Kang


Abstract: As a pivotal task that bridges remote visual and linguistic understanding, Remote Sensing Image-Text Retrieval (RSITR) has attracted considerable research interest in recent years. However, almost all RSITR methods implicitly assume that image-text pairs are perfectly matched. In practice, acquiring a large set of well-aligned data pairs is often prohibitively expensive or even infeasible. In addition, we observe that remote sensing datasets (e.g., RSITMD) do contain some inaccurate or mismatched image-text descriptions. Based on these observations, we reveal an important but untouched problem in RSITR, i.e., Noisy Correspondence (NC). To address this problem, we propose a novel Robust Remote Sensing Image-Text Retrieval (RRSITR) paradigm that uses a self-paced learning strategy to mimic human cognitive learning patterns, thereby learning from easy to hard samples in multi-modal data with NC. Specifically, we first divide all training sample pairs into three categories based on the loss magnitude of each pair, i.e., clean sample pairs, ambiguous sample pairs, and noisy sample pairs. Then, we estimate the reliability of each training pair by assigning it a weight based on its loss value. Further, we design a new multi-modal self-paced function to dynamically regulate the training sequence and weights of the samples, thus establishing a progressive learning process. Finally, for noisy sample pairs, we present a robust triplet loss that dynamically adjusts the soft margin based on semantic similarity, thereby enhancing robustness against noise. Extensive experiments on three popular benchmark datasets demonstrate that the proposed RRSITR significantly outperforms state-of-the-art methods, especially at high noise rates. The code is available at: this https URL
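The loss-based partitioning, per-pair weighting, and similarity-dependent margin described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two thresholds `low` and `high`, the linear weight decay for ambiguous pairs, the zero weight for noisy pairs, and the way `soft_margin_triplet` scales its margin by the positive-pair similarity. The function names are hypothetical as well.

```python
from typing import List


def partition_by_loss(losses: List[float], low: float, high: float) -> List[str]:
    """Label each pair by its loss (thresholds are assumed, not from the paper)."""
    labels = []
    for loss in losses:
        if loss < low:
            labels.append("clean")
        elif loss > high:
            labels.append("noisy")
        else:
            labels.append("ambiguous")
    return labels


def self_paced_weight(loss: float, low: float, high: float) -> float:
    """Assumed weighting: 1 for clean pairs, linear decay for ambiguous, 0 for noisy."""
    if loss < low:
        return 1.0
    if loss > high:
        return 0.0
    return (high - loss) / (high - low)


def soft_margin_triplet(sim_pos: float, sim_neg: float, base_margin: float) -> float:
    """Hinge triplet loss whose margin shrinks when the labelled pair looks dissimilar.

    Scaling the margin by the clipped positive similarity is an illustrative
    choice; the paper's actual soft-margin rule may differ.
    """
    margin = base_margin * max(0.0, min(1.0, sim_pos))
    return max(0.0, margin - sim_pos + sim_neg)


if __name__ == "__main__":
    losses = [0.1, 0.5, 0.9]
    # Raising `low` and `high` over epochs would admit harder pairs later,
    # which is one simple way to realise an easy-to-hard schedule.
    print(partition_by_loss(losses, low=0.3, high=0.7))
    print([self_paced_weight(l, 0.3, 0.7) for l in losses])
    print(soft_margin_triplet(sim_pos=0.8, sim_neg=0.3, base_margin=0.2))
```

In this sketch a pair with low similarity between its image and its labelled caption receives a smaller margin, so a likely mismatched pair pushes the embedding less strongly than a confidently matched one.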

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.28134 [cs.CV]

(or arXiv:2603.28134v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.28134

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Song Qiya
[v1] Mon, 30 Mar 2026 07:55:07 UTC (822 KB)
