| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Paper Quality Evaluation | ICLR 2025 (test) | Kendall Tau Correlation: 48.08 | 32 |
| Paper Acceptance Decision | ICLR submissions 2025 | Accuracy: 89.8 | 17 |
| Binary deficiency detection | gold-labeled ICLR (test) | Accuracy: 86 | 16 |
| Node Retrieval | ICLR 2025 (500 papers) | Recall @ 90.172 | 16 |
| Paper Acceptance Decision | ICLR 2025 (test) | Accuracy: 71.92 | 15 |
| Scientific Idea Generation | ICLR 2024 | Absolute Novelty: 4.22 | 12 |
| Multi-turn role-play | ICLR | Success Rate (SR): 96.2 | 12 |
| Review Score Generation | ICLR 2025 | Average Review Score: 6.4 | 10 |
| Scientific Review Feedback Generation | ICLR LLM-as-a-Judge 2025 (test) | Actionability Score: 3.38 | 9 |
| Scientific Review Feedback Generation | ICLR Human Evaluation 2025 (test) | Actionability: 3.46 | 9 |
| Fine-grained multi-label classification | ICLR gold-labeled (test) | Jaccard Similarity: 74.24 | 8 |
| Holistic Technical Quality Evaluation | ICLR 2025 | Originality: 3.35 | 8 |
| Discrimination between Good Faith and Problematic agents (Peer Review) | ICLR 20.2:1 | Cohen's d: 1.82 | 6 |
| Insight Discovery and Guidance | ICLR Poster 2025 | Guided Paper Percentage: 82.4 | 6 |
| Insight Discovery and Guidance | ICLR Spotlight 2025 | Percentage Guided: 82.1 | 6 |
| Insight Discovery and Guidance | ICLR Oral 2025 | Guidance Rate: 82.2 | 6 |
| Insight Discovery and Guidance | ICLR Overall 2025 | Percentage Guided: 82.4 | 6 |
| Empathetic Dialogue | ICLR | Success Rate (SR): 96.7 | 5 |
| Issue Identification | ICLR 100-paper corpus 2026 | Caught: 3,024 | 4 |
| Faithfulness discrimination | ICLR | AUC: 54.4 | 4 |
| Research solution evaluation | ICLR problems 2026 N=20 (test) | Feasibility Win%: 56 | 4 |
| Citation Coverage Evaluation | ICLR 2025 | Avg Cites: 45.73 | 3 |
| Coverage-based Alignment | ICLR 50 submissions 2026 | Str-Cov: 88.6 | 3 |
| Score-based Alignment | ICLR 2026 (50 submissions) | R-MSE: 0.148 | 3 |
| Research idea quality evaluation | ICLR Rejected Papers 2025 | Mean Score: 2.689 | 2 |