| Dataset Name | SOTA Method | Metric | Score | Trend | Updated |
|---|---|---|---|---|---|
| MMLU | | MMLU Accuracy | 78.9 | 45 | 3d ago |
| MMMLU | DOS-CPT | MMMLU General Knowledge Accuracy | 82.25 | 29 | 1mo ago |
| C-Eval (test) | Qwen-14B | Accuracy | 71.8 | 13 | 1mo ago |
| General-purpose benchmarks average (test) | Qwen3 8B | Accuracy | 73.8 | 12 | 1mo ago |
| MMLU non-IID distribution, alpha=0.1 | FedAlign-MoE | Accuracy | 39.79 | 10 | 25d ago |
| MMLU Computer Security | NPO+KL w/ RNA | Accuracy | 46 | 8 | 4d ago |
| MMLU Corporate Biology | RMU w/ RNA | Accuracy | 60.4 | 8 | 4d ago |
| MMLU Perturbed | NPO+KL w/ RNA | Accuracy | 53.5 | 8 | 4d ago |
| C-Eval (val) | LLaMA2-13B | Accuracy | 34.32 | 8 | 1mo ago |
| CEVAL | | Accuracy | 85.52 | 5 | 1mo ago |
| General Knowledge Evaluation Suite (ARC, HellaSwag, LAMBADA, PIQA, SciQ, WinoGrande, TriviaQA, WebQS, MMLU, GSM8K) | SPLA | ARC-C | 60.2 | 5 | 1mo ago |