| Dataset Name | SOTA Method | Metric | Score | Trend | Updated |
|---|---|---|---|---|---|
| MMMLU | DOS-CPT | MMMLU General Knowledge Accuracy | 82.25 | 29 | 4d ago |
| MMLU | | MMLU Accuracy | 78.9 | 20 | 2d ago |
| C-Eval (test) | Qwen-14B | Accuracy | 71.8 | 13 | 4d ago |
| General-purpose benchmarks average (test) | Qwen3 8B | Accuracy | 73.8 | 12 | 4d ago |
| C-Eval (val) | LLaMA2-13B | Accuracy | 34.32 | 8 | 4d ago |
| CEVAL | | Accuracy | 85.52 | 5 | 4d ago |
| General Knowledge Evaluation Suite (ARC, HellaSwag, LAMBADA, PIQA, SciQ, WinoGrande, TriviaQA, WebQS, MMLU, GSM8K) | SPLA | ARC-C | 60.2 | 5 | 4d ago |