FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration
About
Scientific idea generation (SIG) is critical to AI-driven autonomous research, yet existing approaches are often constrained by a static retrieval-then-generation paradigm, which leads to homogeneous and insufficiently divergent ideas. In this work, we propose FlowPIE, a tightly coupled retrieval-generation framework that treats literature exploration and idea generation as a co-evolving process. FlowPIE expands literature trajectories via a flow-guided Monte Carlo Tree Search (MCTS) inspired by GFlowNets, using the quality of current ideas, as assessed by an LLM-based generative reward model (GRM), as a supervision signal to guide adaptive retrieval and construct a diverse, high-quality initial population. Starting from this population, FlowPIE models idea generation as a test-time idea evolution process, applying selection, crossover, and mutation under the isolation island paradigm with GRM-based fitness computation to incorporate cross-domain knowledge. This design mitigates the information cocoons that arise from over-reliance on parametric knowledge and static literature. Extensive evaluations demonstrate that FlowPIE consistently produces ideas with higher novelty, feasibility, and diversity than strong LLM-based and agent-based frameworks, while enabling reward scaling at test time.
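The evolution stage described above (selection, crossover, and mutation across isolated islands, scored by a fitness function) can be illustrated with a minimal toy sketch. This is not the FlowPIE implementation: it assumes ideas are fixed-length tuples of concept keywords, uses a hypothetical `toy_fitness` in place of the LLM-based GRM, and adds a simple ring migration step that the abstract does not specify. All function names and parameters here are illustrative assumptions, and the flow-guided MCTS stage is not covered.

```python
import random
from typing import Callable, List, Tuple

Idea = Tuple[str, ...]  # toy stand-in: an idea is a tuple of concept keywords


def toy_fitness(idea: Idea) -> float:
    """Hypothetical placeholder for a GRM score: counts distinct concepts."""
    return float(len(set(idea)))


def crossover(a: Idea, b: Idea, rng: random.Random) -> Idea:
    """One-point crossover: prefix of one parent, suffix of the other."""
    cut = rng.randint(1, min(len(a), len(b)) - 1)
    return a[:cut] + b[cut:]


def mutate(idea: Idea, vocab: List[str], rng: random.Random, rate: float = 0.2) -> Idea:
    """Replace each concept by a random vocabulary item with probability `rate`."""
    return tuple(rng.choice(vocab) if rng.random() < rate else c for c in idea)


def evolve_islands(
    islands: List[List[Idea]],
    vocab: List[str],
    fitness: Callable[[Idea], float],
    generations: int = 10,
    migrate_every: int = 5,  # assumed migration schedule, not from the paper
    seed: int = 0,
) -> List[List[Idea]]:
    rng = random.Random(seed)
    for gen in range(1, generations + 1):
        for i, pop in enumerate(islands):
            ranked = sorted(pop, key=fitness, reverse=True)
            parents = ranked[: max(2, len(ranked) // 2)]  # truncation selection
            children = [ranked[0]]  # elitism: carry over the island's best idea
            while len(children) < len(pop):
                a, b = rng.sample(parents, 2)
                children.append(mutate(crossover(a, b, rng), vocab, rng))
            islands[i] = children
        if gen % migrate_every == 0 and len(islands) > 1:
            # Ring migration: each island's worst idea is replaced by the
            # best idea of the previous island in the ring.
            bests = [max(pop, key=fitness) for pop in islands]
            for i, pop in enumerate(islands):
                worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
                pop[worst] = bests[i - 1]
    return islands
```

Islands evolve independently between migrations, which is what keeps the populations from collapsing onto one theme; the occasional exchange of best ideas is the toy analogue of bringing in cross-domain knowledge. In a realistic setting the crossover and mutation operators would act on text through an LLM rather than on keyword tuples.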
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Idea Generation Assessment | AI-Idea-Bench 2025 | Motivation Score | 4.44 | 12 |
| Scientific Idea Generation | AI-Idea-Bench 2025 | Reward Novelty | 0.75 | 7 |
| Scientific Idea Generation | IdeaBench | Semantic Similarity | 0.559 | 6 |