ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models

About

Scientific research, vital for improving human life, is complex, slow, and dependent on specialized expertise. Meanwhile, novel, impactful research often stems from both a deep understanding of prior work and a cross-pollination of ideas across domains and fields. To enhance the productivity of researchers, we propose ResearchAgent, which leverages the encyclopedic knowledge and linguistic reasoning capabilities of Large Language Models (LLMs) to assist them in their work. This system automatically defines novel problems, proposes methods, and designs experiments, while iteratively refining them based on feedback from collaborative LLM-powered reviewing agents. Specifically, starting with a core scientific paper, ResearchAgent is augmented not only with relevant publications, found by connecting information over an academic graph, but also with entities retrieved from a knowledge store built from shared underlying concepts mined across numerous papers. Then, mimicking the scientific practice of improving ideas through peer discussion, we leverage multiple LLM-based ReviewingAgents that provide reviews and feedback over iterative revision rounds. These reviewing agents are instantiated with human preference-aligned LLMs whose evaluation criteria are elicited from actual human judgments via LLM prompting. We experimentally validate ResearchAgent on scientific publications across multiple disciplines, showing its effectiveness in generating novel, clear, and valid ideas according to both human and model-based evaluation. Our initial foray into AI-mediated scientific research has important implications for the development of future systems aimed at supporting researchers in their ideation and operationalization of novel work.
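The generate, review, and refine loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Review`, `Idea`, `build_context`, and `refine`, the 1 to 5 score scale, the stopping threshold, and the prompt wording are all assumptions made for illustration. The sketch also collapses problem, method, and experiment design into a single draft, and it takes the LLM and the reviewing agents as plain callables so that no particular API is implied.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Review:
    criterion: str   # e.g. "clarity" or "novelty" (illustrative labels)
    score: int       # assumed 1-5 scale, not taken from the paper
    feedback: str


@dataclass
class Idea:
    text: str
    # One list of reviews per revision round.
    history: List[List[Review]] = field(default_factory=list)


# Hypothetical interfaces: a generator maps a prompt to a draft,
# and a reviewer maps a draft to a Review.
Generator = Callable[[str], str]
Reviewer = Callable[[str], Review]


def build_context(core_paper: str, related: List[str], entities: List[str]) -> str:
    """Combine the core paper, graph-linked papers, and retrieved entities."""
    related_block = "\n".join(f"- {r}" for r in related)
    return (
        f"Core paper:\n{core_paper}\n\n"
        f"Related papers:\n{related_block}\n\n"
        f"Relevant entities: {', '.join(entities)}"
    )


def refine(
    context: str,
    generate: Generator,
    reviewers: List[Reviewer],
    max_rounds: int = 3,
    target: int = 4,
) -> Idea:
    """Draft an idea, then revise it until every reviewer is satisfied
    or the round budget is spent."""
    idea = Idea(text=generate(context))
    for _ in range(max_rounds):
        reviews = [review(idea.text) for review in reviewers]
        idea.history.append(reviews)
        if all(r.score >= target for r in reviews):
            break
        feedback = "\n".join(f"[{r.criterion}] {r.feedback}" for r in reviews)
        idea.text = generate(
            f"{context}\n\nPrevious draft:\n{idea.text}\n\n"
            f"Reviewer feedback:\n{feedback}"
        )
    return idea
```

In use, `generate` would wrap a call to an LLM and each reviewer would wrap a prompt built around one evaluation criterion. Passing them in as callables keeps the loop testable with simple stubs.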

Jinheon Baek, Sujay Kumar Jauhar, Silviu Cucerzan, Sung Ju Hwang • 2024

Related benchmarks

Task	Dataset	Result
Time Series Forecasting	Stock	MAE 5.588	53
Financial Strategy Generation	Crypto	∆VaR 0.01	34
System-level data generation	Stock	Marginal Fidelity 0.481	17
System-level data generation	LOB	Marginal 0.814	17
Time Series Forecasting	Crypto	RMSE 0.222	17
Time Series Forecasting	LOB	RMSE 0.192	17
Theorem Generation	Future Theorem Prediction dataset (test)	Structure Score 0.64	15
Scientific Idea Generation	AI-Scientist	Absolute Novelty 4.01	14
Idea Generation Assessment	AI-Idea-Bench 2025	Motivation Score 3.78	12
Scientific Idea Generation	ICLR 2024	Absolute Novelty 4.08	12

Showing 10 of 28 rows
