AgentReview: Exploring Peer Review Dynamics with LLM Agents

About

Peer review is fundamental to the integrity and advancement of scientific publication. Traditional analyses of peer review often rely on exploring and computing statistics over existing peer review data. Such analyses do not adequately address the multivariate nature of the process or account for latent variables, and they are further constrained by privacy concerns due to the sensitive nature of the data. We introduce AgentReview, the first large language model (LLM) based peer review simulation framework, which effectively disentangles the impacts of multiple latent factors and addresses the privacy issue. Our study reveals significant insights, including a notable 37.1% variation in paper decisions due to reviewers' biases, supported by sociological theories such as social influence theory, altruism fatigue, and authority bias. We believe that this study could offer valuable insights to improve the design of peer review mechanisms. Our code is available at https://github.com/Ahren09/AgentReview.

Yiqiao Jin, Qinlin Zhao, Yiyang Wang, Hao Chen, Kaijie Zhu, Yijia Xiao, Jindong Wang • 2024

Related benchmarks

Task	Dataset	Metric	Result	
Scientific Manuscript Reviewing	ICLR 2026 (test)	Actionability Score	0.37	38
Paper Quality Evaluation	ICLR 2025 (test)	Kendall Tau Correlation	15.79	32
Automated Peer Review Evaluation	DeepReview-13K 1.0 (test)	H-Max Technical Accuracy	7.55	30
Paper Acceptance Decision	ICLR submissions 2025	Accuracy	51.6	17
Paper Acceptance Decision	ICLR 2025 (test)	Accuracy	53.79	15
Automated Peer Review	DeepReview-13K 2025 (test)	Technical Accuracy Win	45.3	14
Novelty Report Generation	50-paper	Completeness	7.59	7
Counterfactual error detection	Counterfactual dataset	Finding Detection Rate	10	5
Scientific Review Generation	Human Evaluation 50 papers sampled (test)	Actionability (Elo)	974	5

