
FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration

ArXiv CS.AI | by Qiyao Wang, Hongbo Wang, Longze Chen, Zhihao Yang, Guhong Chen, Hamid Alinejad-Rokny, Hui Li, Yuan Lin, Min Yang | April 1, 2026



Abstract: Scientific idea generation (SIG) is critical to AI-driven autonomous research, yet existing approaches are often constrained by a static retrieval-then-generation paradigm, leading to homogeneous and insufficiently divergent ideas. In this work, we propose FlowPIE, a tightly coupled retrieval-generation framework that treats literature exploration and idea generation as a co-evolving process. FlowPIE expands literature trajectories via a flow-guided Monte Carlo Tree Search (MCTS) inspired by GFlowNets, using the quality of current ideas assessed by an LLM-based generative reward model (GRM) as a supervised signal to guide adaptive retrieval and construct a diverse, high-quality initial population. Based on this population, FlowPIE models idea generation as a test-time idea evolution process, applying selection, crossover, and mutation with the isolation island paradigm and GRM-based fitness computation to incorporate cross-domain knowledge. It effectively mitigates the information cocoons arising from over-reliance on parametric knowledge and static literature. Extensive evaluations demonstrate that FlowPIE consistently produces ideas with higher novelty, feasibility and diversity compared to strong LLM-based and agent-based frameworks, while enabling reward scaling during test time.
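The evolutionary stage named in the abstract (selection, crossover, and mutation over isolated islands, with fitness supplied by a reward model) can be illustrated with a generic island-model evolutionary loop. The sketch below is not FlowPIE's implementation. Ideas are toy numeric vectors, `toy_fitness` is a hypothetical stand-in for the GRM score, and tournament selection, one-point crossover, Gaussian mutation, elitism, and ring migration are all assumptions chosen for illustration; the abstract specifies none of them.

```python
import random
from typing import Callable, List

Idea = List[float]  # assumption: a toy "idea" is just a vector of numbers


def toy_fitness(idea: Idea) -> float:
    """Hypothetical stand-in for a GRM score; higher is better, best is 0.0."""
    return -sum((x - 1.0) ** 2 for x in idea)


def tournament_select(pop: List[Idea], fitness: Callable[[Idea], float],
                      rng: random.Random, k: int = 3) -> Idea:
    """Pick the fittest of k randomly drawn individuals."""
    return max(rng.sample(pop, k), key=fitness)


def crossover(a: Idea, b: Idea, rng: random.Random) -> Idea:
    """One-point crossover: prefix from parent a, suffix from parent b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(idea: Idea, rng: random.Random,
           rate: float = 0.2, scale: float = 0.3) -> Idea:
    """Perturb each component with probability `rate` using Gaussian noise."""
    return [x + rng.gauss(0.0, scale) if rng.random() < rate else x
            for x in idea]


def evolve_islands(islands: List[List[Idea]],
                   fitness: Callable[[Idea], float],
                   generations: int = 30,
                   migrate_every: int = 5,
                   seed: int = 0) -> List[List[Idea]]:
    """Evolve each island separately, with occasional ring migration."""
    rng = random.Random(seed)
    for gen in range(1, generations + 1):
        for i, pop in enumerate(islands):
            # Elitism: carry the current best individual over unchanged.
            children = [max(pop, key=fitness)]
            while len(children) < len(pop):
                p1 = tournament_select(pop, fitness, rng)
                p2 = tournament_select(pop, fitness, rng)
                children.append(mutate(crossover(p1, p2, rng), rng))
            islands[i] = children
        if gen % migrate_every == 0:
            # Ring migration: the best of island i replaces the worst
            # individual of island i + 1.
            bests = [max(pop, key=fitness) for pop in islands]
            for i, best in enumerate(bests):
                target = islands[(i + 1) % len(islands)]
                worst = min(range(len(target)),
                            key=lambda j: fitness(target[j]))
                target[worst] = list(best)
    return islands
```

Calling `evolve_islands` with, say, three islands of eight four-dimensional vectors and `toy_fitness` returns the evolved islands. Keeping the islands separate between migrations is what lets each one explore a different region before good candidates are shared, which is the general motivation for island models.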

Comments: 30 pages, 11 figures, 15 tables

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

Cite as: arXiv:2603.29557 [cs.AI] (or arXiv:2603.29557v1 [cs.AI] for this version)

https://doi.org/10.48550/arXiv.2603.29557

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Qiyao Wang
[v1] Tue, 31 Mar 2026 10:37:47 UTC (1,987 KB)
