
ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models

About

The pace of scientific research, vital for improving human life, is complex, slow, and needs specialized expertise. Meanwhile, novel, impactful research often stems from both a deep understanding of prior work, and a cross-pollination of ideas across domains and fields. To enhance the productivity of researchers, we propose ResearchAgent, which leverages the encyclopedic knowledge and linguistic reasoning capabilities of Large Language Models (LLMs) to assist them in their work. This system automatically defines novel problems, proposes methods and designs experiments, while iteratively refining them based on the feedback from collaborative LLM-powered reviewing agents. Specifically, starting with a core scientific paper, ResearchAgent is augmented not only with relevant publications by connecting information over an academic graph but also entities retrieved from a knowledge store derived from shared underlying concepts mined across numerous papers. Then, mimicking a scientific approach to improving ideas with peer discussions, we leverage multiple LLM-based ReviewingAgents that provide reviews and feedback via iterative revision processes. These reviewing agents are instantiated with human preference-aligned LLMs whose criteria for evaluation are elicited from actual human judgments via LLM prompting. We experimentally validate our ResearchAgent on scientific publications across multiple disciplines, showing its effectiveness in generating novel, clear, and valid ideas based on both human and model-based evaluation results. Our initial foray into AI-mediated scientific research has important implications for the development of future systems aimed at supporting researchers in their ideation and operationalization of novel work.

Jinheon Baek, Sujay Kumar Jauhar, Silviu Cucerzan, Sung Ju Hwang • 2024
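
The generate, review, and revise cycle described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Review`, `Idea`, and `refine_idea`, the 1-5 score scale, the stopping threshold, and the callable signatures are all hypothetical, and the toy functions stand in for real LLM calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Review:
    criterion: str  # e.g. "novelty", "clarity", "validity"
    score: int      # hypothetical 1-5 scale, not taken from the paper
    feedback: str


@dataclass
class Idea:
    text: str
    rounds: List[List[Review]] = field(default_factory=list)


# Stand-ins for LLM calls; both signatures are assumptions for illustration.
Generator = Callable[[str, List[str], List[str], Optional[str], List[Review]], str]
Reviewer = Callable[[str], Review]


def refine_idea(core_paper: str, related_papers: List[str], entities: List[str],
                generate: Generator, reviewers: List[Reviewer],
                max_rounds: int = 3, target_score: int = 4) -> Idea:
    """Draft an idea, then alternate reviewing and revising it."""
    # First draft: there is no previous idea and no feedback yet.
    idea = Idea(text=generate(core_paper, related_papers, entities, None, []))
    for _ in range(max_rounds):
        reviews = [review(idea.text) for review in reviewers]
        idea.rounds.append(reviews)
        if all(r.score >= target_score for r in reviews):
            break  # every reviewer is satisfied
        # Revise using the previous text plus the latest feedback. If
        # max_rounds runs out here, the last revision is returned unreviewed.
        idea.text = generate(core_paper, related_papers, entities, idea.text, reviews)
    return idea


# Toy stand-ins so the sketch runs without any model access.
def toy_generate(core, related, entities, previous, reviews):
    if previous is None:
        return f"Idea drawing on {core} and {len(related)} related papers"
    return previous + " (revised)"


def toy_reviewer(criterion: str) -> Reviewer:
    def review(text: str) -> Review:
        # Pretend each revision raises the score by one point.
        score = min(5, 2 + text.count("(revised)"))
        return Review(criterion, score, f"improve {criterion}")
    return review


result = refine_idea("core paper", ["paper A", "paper B"], ["entity X"],
                     toy_generate, [toy_reviewer("novelty"), toy_reviewer("clarity")])
```

In a real system, `generate` and each reviewer would wrap prompts to an LLM, with `related_papers` drawn from the academic graph and `entities` from the knowledge store; how those are retrieved is not covered by this sketch.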

Related benchmarks

Task | Dataset | Result | Rank
Idea Generation Assessment | AI-Idea-Bench 2025 | Motivation Score: 3.78 | 12
Scientific ideation | Scientific Ideation 60 samples human-validated (test) | Novelty: 2.66 | 9
Best Research Idea Selection | D_group | Best Score: 40.12 | 7
Binary Research Idea Classification | D_point NeurIPS25 and ICLR25 | Acc_2: 59.48 | 7
Pair-wise Research Idea Comparison | D_pair Easy | Acc_easy: 52.32 | 7
Research Idea Ranking | D_group | Listwise Score: 66.52 | 7
Ternary Research Idea Classification | D_point | Accuracy (3 Class): 54.84 | 7
Pair-wise Research Idea Comparison | D_pair Hard | Acc_hard: 43 | 7
Scientific Idea Generation | AI-Idea-Bench 2025 | Reward Novelty: 0.49 | 7
Scientific Idea Generation | IdeaBench | Semantic Similarity: 0.558 | 6
