Automated Peer Reviewing in Paper SEA: Standardization, Evaluation, and Analysis

About

In recent years, the rapid increase in scientific papers has overwhelmed traditional review mechanisms, resulting in publications of varying quality. Although existing methods have explored the capabilities of Large Language Models (LLMs) for automated scientific reviewing, the content they generate is often generic or partial. To address these issues, we introduce SEA, an automated paper reviewing framework. It comprises three modules: Standardization, Evaluation, and Analysis, which are represented by the models SEA-S, SEA-E, and SEA-A, respectively. First, SEA-S distills the data standardization capabilities of GPT-4 to integrate multiple reviews of a paper into one. Then, SEA-E is fine-tuned on the standardized data, enabling it to generate constructive reviews. Finally, SEA-A introduces a new evaluation metric called the mismatch score to assess the consistency between paper contents and reviews. Moreover, we design a self-correction strategy to enhance this consistency. Extensive experimental results on datasets collected from eight venues show that SEA can generate valuable insights for authors to improve their papers.

Jianxiang Yu, Zichen Ding, Jiaqi Tan, Kangyang Luo, Zhenmin Weng, Chenghua Gong, Long Zeng, Renjing Cui, Chengcheng Han, Qiushi Sun, Zhiyong Wu, Yunshi Lan, Xiang Li • 2024

Related benchmarks

Task	Dataset	Result
Paper Quality Evaluation	ICLR 2025 (test)	Kendall Tau Correlation: 8.8	32
Paper Acceptance Decision	ICLR submissions 2025	Accuracy: 69.6	17
Paper Acceptance Decision	ICLR 2025 (test)	Accuracy: 37.07	15
Review Generation	PeerRead and OpenReview (test)	ROUGE-1: 42.8	9
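
The Paper Quality Evaluation row reports a Kendall Tau correlation, which measures how well a model's ranking of papers agrees with a reference ranking. As a minimal sketch of what that statistic computes, the following uses the tau-a variant (concordant minus discordant pairs, divided by all pairs). The page does not say which variant, scale, or evaluation script the benchmark uses, so the choice of tau-a is an assumption, and the function name `kendall_tau` and the score lists are hypothetical illustration only.

```python
from itertools import combinations


def kendall_tau(predicted, reference):
    """Kendall tau-a between two equally long score lists.

    Assumption: tau-a is used here for simplicity; tied pairs count as
    neither concordant nor discordant. The benchmark may use another variant.
    """
    if len(predicted) != len(reference):
        raise ValueError("score lists must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two items to compare")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # The sign of the product tells whether both lists order the pair the same way.
        sign = (predicted[i] - predicted[j]) * (reference[i] - reference[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1

    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical scores for four papers (not taken from any benchmark).
predicted_scores = [6.5, 4.0, 7.2, 5.1]
reference_scores = [6.0, 5.0, 8.0, 3.0]
tau = kendall_tau(predicted_scores, reference_scores)
print(f"Kendall tau-a: {tau:.3f}")
```

A value of 1 means the two rankings agree on every pair, -1 means they disagree on every pair, and values near 0 indicate little agreement.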

