| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Test-Time Scaling Selection | RouterBench Quality-Priority | Reward 0.7924 | 16 |
| Test-Time Scaling Selection | RouterBench Cost-Sensitive | Reward 0.8337 | 16 |
| Predictive LLM Routing | RouterBench | AUC 91.91 | 14 |
| LLM Routing | RouterBench | QNC 1.66 | 14 |
| Text-based LLM Routing | RouterBench | Utility Score 55.58 | 12 |
| Routing | RouterBench (test) | Accuracy 91.4 | 11 |
| Ranking quality gain estimation | RouterBench | Ranking Quality Gain (%) 31.05 | 9 |
| LLM Routing | RouterBench Out-of-domain | nAUC 75.6 | 9 |
| Predictive Model Routing | RouterBench Quality-Priority | Reward 0.6121 | 8 |
| Predictive Model Routing | RouterBench Cost-Sensitive | Reward 0.6226 | 8 |
| Aggregate Model Evaluation | RouterBench subsampled 2500 s | Accuracy 79.1 | 8 |
| LLM Routing | RouterBench held-out (test) | Accuracy 91.3 | 6 |
| Routing | RouterBench | Accuracy - | 0 |