| Dataset Name | SOTA Method | Metric | Trend | Updated | |
|---|---|---|---|---|---|
| DocRED (dev) | DREEAM (student) | F1 Score 57.55 | 14 | 4d ago | |
| Claim-evidence sets (test) | Qwen3-Reranker-8B | MAP 54.19 | 13 | 4d ago | |
| DocRED (test) | DREEAM (student) | F1 Score 57.34 | 13 | 4d ago | |
| MR2 | MEVER | Recall@1 5.9 | 7 | 4d ago | |
| Mocheg | MEVER | Recall@1 24.4 | 7 | 4d ago | |
| ChartCheck | MEVER | Recall@1 56 | 7 | 4d ago | |
| AIChartClaim | MEVER | Recall@1 65.7 | 7 | 4d ago | |
| MR2 | MEVER | Precision@1 37.6 | 7 | 4d ago | |
| Mocheg | MEVER | Prec@1 53.1 | 7 | 4d ago | |
| MR2 | MEVER | MAP 19.5 | 7 | 4d ago | |
| Mocheg | MEVER | MAP 41.6 | 7 | 4d ago | |
| ChartCheck | MEVER | MAP 63.6 | 7 | 4d ago | |
| AIChartClaim | MEVER | MAP 71.4 | 7 | 4d ago | |
| FRAMES | HybridDeepSearcher | Evidence Coverage Rate 55.8 | 6 | 2d ago | |
| FanOutQA | HybridDeepSearcher | Evidence Coverage Rate 61 | 6 | 2d ago | |
| MuSiQue | HybridDeepSearcher | Evidence Coverage Rate 40.7 | 6 | 2d ago | |
| DocRED v1 (test) | SAIS-RoBERTa-large | F1 Score 55.67 | 6 | 4d ago | |
| DocRED v1 (dev) | SAIS-RoBERTa-large | F1 Score 55.84 | 6 | 4d ago | |