MELT: Improve Composed Image Retrieval via the Modification Frequentation-Rarity Balance Network
Abstract: Composed Image Retrieval (CIR) uses a reference image and a modification text as a query to retrieve a target image satisfying the requirement of "modifying the reference image according to the text instructions". However, existing CIR methods face two limitations: (1) frequency bias leading to "Rare Sample Neglect", and (2) susceptibility of similarity scores to interference from hard negative samples and noise. To address these limitations, we confront two key challenges: asymmetric rare semantic localization and robust similarity estimation under hard negative samples. To solve these challenges, we propose the Modification frEquentation-rarity baLance neTwork (MELT). MELT assigns increased attention to rare modification semantics in multimodal contexts while applying diffusion-based denoising to hard negative samples with high similarity scores, enhancing multimodal fusion and matching. Extensive experiments on two CIR benchmarks validate the superior performance of MELT. Code is available at this https URL.
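To make the two ideas in the abstract concrete, here is a minimal sketch of (a) giving rare modification words more weight than frequent ones and (b) ranking candidate images by similarity to a query embedding. It is not MELT's implementation: the inverse-frequency weighting, the toy vectors, and the names `rarity_weights` and `rank_candidates` are all assumptions made for illustration, and the diffusion-based denoising step is not modeled at all.

```python
import math
from collections import Counter


def rarity_weights(corpus, text):
    """Return (token, weight) pairs for `text`, weights summing to 1.

    Assumption: rarity is approximated by a smoothed inverse document
    frequency over `corpus`; the paper may define rarity differently.
    """
    doc_freq = Counter()
    for doc in corpus:
        # Count each token once per modification text.
        doc_freq.update(set(doc.lower().split()))
    n_docs = len(corpus)
    tokens = text.lower().split()
    raw = [math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in tokens]
    total = sum(raw)
    return [(t, w / total) for t, w in zip(tokens, raw)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query_vec, candidates):
    """Return candidate names sorted by descending cosine similarity to the query."""
    return sorted(candidates, key=lambda name: cosine(query_vec, candidates[name]), reverse=True)


# Hypothetical modification texts; "paisley" appears once, "make" and "it" often.
corpus = ["make it red", "make it blue", "make it longer", "add a paisley pattern"]
weights = rarity_weights(corpus, "make it paisley")

# Hypothetical 2-D embeddings standing in for a fused query and target images.
ranking = rank_candidates([1.0, 0.0], {"a": [0.9, 0.1], "b": [0.0, 1.0], "c": [-1.0, 0.0]})
```

In this toy setup the rare token "paisley" receives a larger weight than "make" or "it", which is the intuition behind countering frequency bias; a real system would apply such weights inside a learned multimodal encoder instead of to raw word counts.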
Comments: Accepted by ICASSP 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.29291 [cs.CV] (or arXiv:2603.29291v1 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2603.29291
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Zhiwei Chen
[v1] Tue, 31 Mar 2026 05:52:58 UTC (2,049 KB)