
ICLR

Benchmarks

Task Name	Dataset Name	SOTA Result
Scientific Manuscript Reviewing	ICLR 2026 (test)	Actionability Score 0.55	38
Paper Quality Evaluation	ICLR 2025 (test)	Kendall Tau Correlation 48.08	32
Paper Acceptance Decision	ICLR submissions 2025	Accuracy 89.8	17
Binary deficiency detection	gold-labeled ICLR (test)	Accuracy 86	16
Node Retrieval	ICLR 2025 (500 papers)	Recall @ 90.172	16
Paper Acceptance Decision	ICLR 2025 (test)	Accuracy 71.92	15
Aspect-based Review Scoring	ICLR-5k	Actionability Score 2.26	12
Scientific Idea Generation	ICLR 2024	Absolute Novelty 4.22	12
Multi-turn role-play	ICLR	Success Rate (SR) 96.2	12
Review Score Generation	ICLR 2025	Average Review Score 6.4	10
Scientific Review Feedback Generation	ICLR LLM-as-a-Judge 2025 (test)	Actionability Score 3.38	9
Scientific Review Feedback Generation	ICLR Human Evaluation 2025 (test)	Actionability 3.46	9
Fine-grained multi-label classification	ICLR gold-labeled (test)	Jaccard Similarity 74.24	8
Holistic Technical Quality Evaluation	ICLR 2025	Originality 3.35	8
Discrimination between Good Faith and Problematic agents (Peer Review)	ICLR 20.2:1	Cohen's d 1.82	6
Insight Discovery and Guidance	ICLR Poster 2025	Guided Paper Percentage 82.4	6
Insight Discovery and Guidance	ICLR Spotlight 2025	Percentage Guided 82.1	6
Insight Discovery and Guidance	ICLR Oral 2025	Guidance Rate 82.2	6
Insight Discovery and Guidance	ICLR Overall 2025	Percentage Guided 82.4	6
Empathetic Dialogue	ICLR	Success Rate (SR) 96.7	5
Issue Identification	ICLR 100-paper corpus 2026	Caught 3,024	4
Faithfulness discrimination	ICLR	AUC 54.4	4
Research solution evaluation	ICLR problems 2026 N=20 (test)	Feasibility Win% 56	4
Citation Coverage Evaluation	ICLR 2025	Avg Cites 45.73	3
Coverage-based Alignment	ICLR 50 submissions 2026	Str-Cov 88.6	3

Showing 25 of 32 rows