| Dataset Name | Metric | Value | | Updated |
|---|---|---|---|---|
| RAGRouter-Bench | Savings (%) | 60 | 5 | 12d ago |
| FRAMES In-Distribution (test) | CPT (90%) | 77.9 | 4 | 1mo ago |
| MATH-500 In-Distribution (test) | CPT (90%) | 76.15 | 4 | 1mo ago |
| AIME In-Distribution (test) | CPT (90%) | 87.22 | 4 | 1mo ago |
| LSAT In-Distribution (test) | CPT (90%) | 80.14 | 4 | 1mo ago |
| MMLU-Pro In-Distribution (test) | CPT (90%) | 83.57 | 4 | 1mo ago |
| MMLU-Redux In-Distribution (test) | CPT (90%) | 75.06 | 4 | 1mo ago |
| MMLU In-Distribution (test) | CPT (90%) | 76.3 | 4 | 1mo ago |
| GPQA Diamond In-Distribution (test) | CPT (90%) | 80.36 | 4 | 1mo ago |
| FRAMES OOD | CPT 85% | 68.74 | 4 | 1mo ago |
| MATH-500 OOD | CPT (85%) | 63.55 | 4 | 1mo ago |
| LSAT OOD | CPT 85% | 70.54 | 4 | 1mo ago |
| MMLU-Pro OOD | CPT Score (85%) | 74.4 | 4 | 1mo ago |
| MMLU-Redux OOD | CPT (85%) | 63.09 | 4 | 1mo ago |
| MMLU OOD | CPT (85%) Score | 63.55 | 4 | 1mo ago |
| GPQA-Diamond OOD | CPT Accuracy (85%) | 74.3 | 4 | 1mo ago |
| FRAMES | CPT (95%) | 88.84 | 4 | 1mo ago |
| MATH 500 | CPT (95%) | 87.63 | 4 | 1mo ago |
| AIME | CPT (95%) | 93.61 | 4 | 1mo ago |
| LSAT | CPT (95%) | 90.01 | 4 | 1mo ago |
| MMLU-Pro | CPT (95%) | 91.42 | 4 | 1mo ago |
| MMLU-Redux | CPT (95%) | 86.79 | 4 | 1mo ago |
| MMLU | CPT Score (95%) | 88.15 | 4 | 1mo ago |
| GPQA Diamond | CPT (95%) | 89.19 | 4 | 1mo ago |
| FRAMES | CPT (90%) | 78.61 | 4 | 1mo ago |