Minimizing the Pretraining Gap: Domain-aligned Text-Based Person Retrieval
arXiv:2507.10195v2 Announce Type: replace
Authors: Shuyu Yang, Yaxiong Wang, Yongrui Li, Li Zhu, Zhedong Zheng
Abstract: In this work, we focus on text-based person retrieval, which identifies individuals based on textual descriptions. Despite advancements enabled by synthetic data for pretraining, a significant domain gap, due to variations in lighting, color, and viewpoint, limits the effectiveness of the pretrain-finetune paradigm. To overcome this issue, we propose a unified pipeline incorporating domain adaptation at both image and region levels. Our method features two key components: Domain-aware Diffusion (DaD) for image-level adaptation, which aligns image distributions between synthetic and real-world domains, e.g., CUHK-PEDES, and Multi-granularity Relation Alignment (MRA) for region-level adaptation, which aligns visual regions with descriptive sentences, thereby addressing disparities at a finer granularity. This dual-level strategy effectively bridges the domain gap, achieving state-of-the-art performance on CUHK-PEDES, ICFG-PEDES, and RSTPReid datasets. The dataset, model, and code are available at this https URL.
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2507.10195 [cs.CV]
(or arXiv:2507.10195v2 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2507.10195
arXiv-issued DOI via DataCite
Submission history
From: Shuyu Yang
[v1] Mon, 14 Jul 2025 12:03:04 UTC (2,655 KB)
[v2] Mon, 30 Mar 2026 07:56:28 UTC (3,007 KB)