| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Model Routing Suite (MathQA, LogiQA, MedQA, PIQA, TruthQA, MMLU, GSM8k, GPQA, ASDiv, SoQA) | KMeans | Overall Accuracy | 66.2 | 18 | 4d ago |
| OOD Set: AIME, Humanity's Last Exam, SimpleQA, OlympiadBench (test) | SCOPE | Avg. A | 50.8 | 11 | 4d ago |
| SCOPE-60K 5% split (test) | SCOPE | Avg. Accuracy | 75 | 11 | 4d ago |
| Model-Query Evaluation (test) | LOCUS | Routing Accuracy (%) | 64.7 | 9 | 4d ago |
| EmbedLLM (test) | DNA | Accuracy | 67.2 | 4 | 4d ago |