Evolving Idea Graphs with Learnable Edits-and-Commits for Multi-Agent Scientific Ideation

About

LLM-empowered multi-agent systems offer new potential to accelerate scientific discovery by generating novel research ideas. However, existing methods typically coordinate agents through transient text, such as drafts or chat logs, which makes it difficult to pinpoint the weaknesses in generated ideas and to trace how agents refine them. To address this, we introduce Evolving Idea Graphs (EIG), a graph-based multi-agent scientific ideation framework that generates research ideas scoring well on benchmark-native metrics such as novelty, feasibility, and clarity. Instead of coordinating solely through text, EIG represents a partially formed proposal as an evolving idea graph, where nodes capture scientific claims and edges encode relations (e.g., support and conflict), so that unresolved weaknesses remain identifiable throughout the evolution of an idea. Specifically, a learned two-head controller operates over the evolving graph to guide ideation: one head selects graph edits for agents to execute, while the other decides when the graph is ready to be committed for final proposal synthesis. On AI Idea Bench 2025 and LiveIdeaBench, EIG outperforms all compared systems on both automatic benchmark scores and blind expert ratings. Ablations further show that explicit graph state provides the main performance gains, and that learned edit-and-commit control adds consistent improvements.
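
To make the graph representation described above more concrete, the following is a minimal sketch, not the authors' implementation. All names in it (IdeaGraph, unresolved, select_edit, should_commit) are hypothetical. It assumes that a claim counts as an unresolved weakness when it is the target of a conflict edge and of no support edge, and it replaces the learned two-head controller with two hand-written rules purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class IdeaGraph:
    """Toy idea graph: nodes are claims, edges are typed relations."""
    claims: dict[str, str] = field(default_factory=dict)
    # Each edge is (source_id, relation, target_id).
    edges: list[tuple[str, str, str]] = field(default_factory=list)

    def add_claim(self, claim_id: str, text: str) -> None:
        self.claims[claim_id] = text

    def add_edge(self, source: str, relation: str, target: str) -> None:
        if relation not in ("support", "conflict"):
            raise ValueError(f"unknown relation: {relation}")
        if source not in self.claims or target not in self.claims:
            raise KeyError("both endpoints must be existing claims")
        self.edges.append((source, relation, target))

    def unresolved(self) -> list[str]:
        """Assumed definition: claims hit by a conflict edge and no support edge."""
        conflicted = {t for _, r, t in self.edges if r == "conflict"}
        supported = {t for _, r, t in self.edges if r == "support"}
        return sorted(conflicted - supported)


def select_edit(graph: IdeaGraph):
    """Stand-in for the edit head: a fixed rule, not a learned policy."""
    weak = graph.unresolved()
    return ("add_support", weak[0]) if weak else None


def should_commit(graph: IdeaGraph) -> bool:
    """Stand-in for the commit head: commit once no weakness remains."""
    return bool(graph.claims) and not graph.unresolved()


graph = IdeaGraph()
graph.add_claim("c1", "Graph state exposes weaknesses in a draft idea.")
graph.add_claim("c2", "Text-only coordination hides those weaknesses.")
graph.add_claim("c3", "Maintaining the graph may cost extra agent calls.")
graph.add_edge("c2", "support", "c1")
graph.add_edge("c3", "conflict", "c2")

edit_before = select_edit(graph)      # the rule targets the contested claim c2
commit_before = should_commit(graph)  # a weakness remains, so no commit yet

# An agent executes the selected edit by adding a supporting claim.
graph.add_claim("c4", "Chat logs do not mark which claims are contested.")
graph.add_edge("c4", "support", "c2")
commit_after = should_commit(graph)
```

In this toy run the contested claim stays visible in the graph until an agent adds support for it, which is the property the abstract attributes to explicit graph state. How EIG actually scores edits and commit readiness is learned and is not specified here.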

Jiangwen Dong, Bo Li, Wanyu Lin • 2026

Related benchmarks

Task | Dataset | Result | Rank
Research Proposal Generation | AI Idea Bench (AIIB) held-out 2025 | AIIB Score 7.69 | 7
Research Proposal Generation | LiveIdeaBench held-out | Live Score 7.12 | 7
Research Proposal Generation | AIIB and LiveIdeaBench Combined | Average Score 7.41 | 7
Scientific Proposal Generation | AI Idea Bench and LiveIdeaBench 24 held-out benchmark groups 2025 | Novelty 3.04 | 7
