Gen-Searcher: Reinforcing Agentic Search for Image Generation

arXiv, March 31, 2026

Authors: Kaituo Feng, Manyuan Zhang, Shuang Chen, Yunlong Lin, Kaixuan Fan, Yilei Jiang, Hongyu Li, Dian Zheng, Chenyang Wang, Xiangyu Yue

Abstract: Recent image generation models can produce high-fidelity, photorealistic images. However, they are fundamentally constrained by frozen internal knowledge and therefore often fail on real-world scenarios that are knowledge-intensive or require up-to-date information. In this paper, we present Gen-Searcher, the first attempt to train a search-augmented image generation agent that performs multi-hop reasoning and search to collect the textual knowledge and reference images needed for grounded generation. To achieve this, we construct a tailored data pipeline and curate two high-quality datasets, Gen-Searcher-SFT-10k and Gen-Searcher-RL-6k, containing diverse search-intensive prompts and corresponding ground-truth synthesis images. We further introduce KnowGen, a comprehensive benchmark that explicitly requires search-grounded external knowledge for image generation and evaluates models along multiple dimensions. Based on these resources, we train Gen-Searcher with supervised fine-tuning (SFT) followed by agentic reinforcement learning with dual reward feedback, which combines text-based and image-based rewards to provide more stable and informative learning signals for GRPO training. Experiments show that Gen-Searcher brings substantial gains, improving Qwen-Image by around 16 points on KnowGen and 15 points on WISE. We hope this work can serve as an open foundation for search agents in image generation, and we fully open-source our data, models, and code.
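The abstract does not spell out how the two rewards are combined, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each rollout receives a text-based score and an image-based score in [0, 1], blends them with a hypothetical weight `text_weight`, and then computes group-relative advantages in the usual GRPO style (reward minus group mean, divided by group standard deviation). The function names, the linear blend, and the use of the population standard deviation are all illustrative assumptions.

```python
from statistics import mean, pstdev


def dual_reward(text_score: float, image_score: float, text_weight: float = 0.5) -> float:
    """Blend a text-based and an image-based score into one scalar reward.

    Assumption: both scores lie in [0, 1] and are combined linearly.
    The weight is hypothetical; the abstract gives no formula.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be in [0, 1]")
    return text_weight * text_score + (1.0 - text_weight) * image_score


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one group of rollouts for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps).
    eps keeps the division finite when every rollout gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std, an illustrative choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical rollouts for one prompt: (text_score, image_score).
rollouts = [(0.9, 0.7), (0.4, 0.6), (0.2, 0.1), (0.8, 0.8)]
rewards = [dual_reward(t, i) for t, i in rollouts]
advantages = group_relative_advantages(rewards)

for (t, i), r, a in zip(rollouts, rewards, advantages):
    print(f"text={t:.2f} image={i:.2f} reward={r:.3f} advantage={a:+.3f}")
```

Under these assumptions, a rollout that does well on only one signal is pulled toward the middle of the group, which is one plausible reading of why a combined reward could give a steadier learning signal than either score alone.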

Comments: Project page: this https URL Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.28767 [cs.CV]

(or arXiv:2603.28767v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.28767

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Kaituo Feng
[v1] Mon, 30 Mar 2026 17:59:56 UTC (8,187 KB)
