MM-OVSeg: Multimodal Optical-SAR Fusion for Open-Vocabulary Segmentation in Remote Sensing

[Submitted on 18 Mar 2026 (v1), last revised 27 Mar 2026 (this version, v2)]

Authors: Yimin Wei, Aoran Xiao, Hongruixuan Chen, Junshi Xia, Naoto Yokoya

Abstract: Open-vocabulary segmentation enables pixel-level recognition from an open set of textual categories, allowing generalization beyond fixed classes. Despite its great potential in remote sensing, progress in this area remains largely limited to clear-sky optical data, and existing methods struggle under cloudy or haze-contaminated conditions. We present MM-OVSeg, a multimodal Optical-SAR fusion framework for resilient open-vocabulary segmentation under adverse weather conditions. MM-OVSeg leverages the complementary strengths of the two modalities: optical imagery provides rich spectral semantics, while synthetic aperture radar (SAR) offers cloud-penetrating structural cues. To address the cross-modal domain gap and the limited dense prediction capability of current vision-language models, we propose two key designs: a cross-modal unification process for multi-sensor representation alignment, and a dual-encoder fusion module that integrates hierarchical features from multiple vision foundation models for text-aligned multimodal segmentation. Extensive experiments demonstrate that MM-OVSeg achieves superior robustness and generalization across diverse cloud conditions. The source dataset and code are available at this https URL.
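To make the general idea concrete, the following is a minimal toy sketch of text-aligned per-pixel classification over fused optical and SAR features. It is not the MM-OVSeg architecture: the cloud-weighted averaging in `fuse_pixel`, the per-pixel cloud probability input, the two-dimensional features, and all function names are assumptions introduced purely for illustration. The only ideas taken from the abstract are that the two modalities are fused and that pixels are matched against text categories.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_pixel(optical_feat, sar_feat, cloud_prob):
    """Hypothetical fusion rule: rely less on optical features as cloud cover rises."""
    opt = l2_normalize(optical_feat)
    sar = l2_normalize(sar_feat)
    w_opt, w_sar = 1.0 - cloud_prob, cloud_prob
    return l2_normalize([w_opt * o + w_sar * s for o, s in zip(opt, sar)])


def classify_pixel(fused_feat, text_embeddings):
    """Return the category whose text embedding has the highest cosine similarity."""
    best_label, best_score = None, -math.inf
    for label, emb in text_embeddings.items():
        score = sum(a * b for a, b in zip(fused_feat, l2_normalize(emb)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def segment(optical, sar, cloud, text_embeddings):
    """Label every pixel of a small image given per-pixel features and cloud probabilities."""
    return [
        [classify_pixel(fuse_pixel(o, s, c), text_embeddings)
         for o, s, c in zip(opt_row, sar_row, cloud_row)]
        for opt_row, sar_row, cloud_row in zip(optical, sar, cloud)
    ]


# Toy 1x2 image with made-up 2-D features; the category list is open-ended.
text_embeddings = {"water": [1.0, 0.0], "building": [0.0, 1.0]}
optical = [[[0.9, 0.1], [0.7, 0.7]]]   # second pixel is ambiguous (cloud-covered)
sar = [[[0.5, 0.5], [0.1, 0.9]]]       # SAR still carries structure under cloud
cloud = [[0.0, 0.9]]                   # assumed per-pixel cloud probability

labels = segment(optical, sar, cloud, text_embeddings)
print(labels)
```

Because the categories are supplied as a dictionary of text embeddings, adding a new class only requires adding an entry, which is the sense in which the vocabulary is "open" in this toy setting.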

Comments: CVPR 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.17528 [cs.CV] (or arXiv:2603.17528v2 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.17528 (arXiv-issued DOI via DataCite)

Submission history

From: YiMin Wei
[v1] Wed, 18 Mar 2026 09:34:23 UTC (2,631 KB)
[v2] Fri, 27 Mar 2026 14:52:22 UTC (3,129 KB)

Original source: arXiv, https://arxiv.org/abs/2603.17528
