| Dataset Name | SOTA Method | Metric | Value | Trend | Last Updated |
|---|---|---|---|---|---|
| Aggregate 10-Benchmark Suite | FineRouter | Average Score | 79.9 | 29 | 1mo ago |
| Overall Performance across 12 Datasets | Deep Lasso | Rank | 1.42 | 29 | 3mo ago |
| MMLU, GSM, HellaSwag, TruthfulQA, ARC-C, CodeX | MADS8B | Improvement | 5.32 | 18 | 2d ago |
| Average SST2, AGNEWS, GSM8K | | HS Score | 26.47 | 11 | 3mo ago |