| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| LAMA (full) | Coherence Boosting (CB) | Accuracy | 37.57 | 24 | 3d ago |
| Empathetic Dialogue (EMP) | MCTS-Driven Knowledge Retrieval | BERTScore Precision (avg) | 85.27 | 16 | 3d ago |
| DailyDialog | MCTS-Driven Knowledge Retrieval | BERTScore Precision (avg) | 84.29 | 16 | 3d ago |
| MemoTrap | CoDA | Proverb Score | 42.5 | 12 | 2d ago |
| OK-VQA v1.1 (test) | RA-VQA-v2 (T5-large) | Recall@5 | 89.32 | 10 | 3d ago |
| STARK-PRIME official (test) | AVATAR | Hit@1 | 18.44 | 8 | 3d ago |
| STARK MAG official (test) | AVATAR | Hit@1 | 44.36 | 8 | 3d ago |
| STARK Amazon official (test) | AVATAR | Hit@1 | 49.87 | 8 | 3d ago |
| ChineseSimpleQA | ERNIE 5.0 | Accuracy | 86.03 | 5 | 3d ago |
| SimpleQA | ERNIE 5.0 | Accuracy | 74.01 | 5 | 3d ago |
| Wikipedia (45 country-specific domains, 89 queries) (test) | GraphRAGDRIFT-dec | Comprehensiveness | 0.9 | 5 | 3d ago |
| S (test) | BERT-Large | Acc@10 | 46.8 | 5 | 3d ago |
| Woz 2.1 (test) | Entriever | Joint Accuracy | 0.8024 | 3 | 3d ago |
| In-Car (test) | Entriever | Joint Accuracy | 78.66 | 3 | 3d ago |
| Camrest (test) | Entriever | Joint Accuracy | 83.17 | 3 | 3d ago |
| MobileCS (test) | Entriever | Joint Accuracy | 77.21 | 3 | 3d ago |