| Dataset Name | SOTA Method | Metric | Trend | | |
|---|---|---|---|---|---|
| TRACE | MagMax | C-STANCE Accuracy 59 | 29 | 4d ago | |
| Open LLM Leaderboard Lighteval (test) | | Mean Accuracy 91.07 | 17 | 4d ago | |
| General domain benchmarks (test) | AM-Thinking (math) | DROP Score 93.3 | 16 | 2d ago | |
| MMLU-Redux | Qwen 3 14B | Accuracy 83.7 | 14 | 4d ago | |
| LLM Evaluation Suite (ARC, CSQA, GSM8K, HS, MMLU, OBQA, PIQA, SIQA, TQA, WG) | Muon (OSP) | ARC 45.9 | 14 | 4d ago | |
| Academic Benchmarks (test) | Camelidae-8x34B-pro | Average Score 59.9 | 10 | 4d ago | |
| OpenLLM Leaderboard BBH, GPQA, IFEVAL, MMLU, MUSR (test) | | BBH 72.7 | 4 | 4d ago | |