| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Node Retrieval | ICLR 2025 (500 papers) | Recall @ 90.172 | 16 |
| Paper Acceptance Decision | ICLR 2025 (test) | Accuracy: 71.92 | 15 |
| Paper Quality Evaluation | ICLR 2025 (test) | Jaccard Index: 37.98 | 15 |
| Multi-turn role-play | ICLR | Success Rate (SR): 96.2 | 12 |
| Review Score Generation | ICLR 2025 | Average Review Score: 6.4 | 10 |
| Scientific Review Feedback Generation | ICLR LLM-as-a-Judge 2025 (test) | Actionability Score: 3.38 | 9 |
| Scientific Review Feedback Generation | ICLR Human Evaluation 2025 (test) | Actionability: 3.46 | 9 |
| Holistic Technical Quality Evaluation | ICLR 2025 | Originality: 3.35 | 8 |
| Empathetic Dialogue | ICLR | Success Rate (SR): 96.7 | 5 |
| Citation Coverage Evaluation | ICLR 2025 | Avg Cites: 45.73 | 3 |
| Coverage-based Alignment | ICLR 50 submissions 2026 | Str-Cov: 88.6 | 3 |
| Score-based Alignment | ICLR 2026 (50 submissions) | R-MSE: 0.148 | 3 |