| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| MMLU | ID-LoRA | Accuracy | 60.6 | 23 | 4d ago |
| JARVIS-VLA Benchmark 1.0 (test) | GPT-4o | Accuracy | 96.6 | 10 | 4d ago |
| MMLU | Sigma-MoE-Tiny Base | EM (World Knowledge) | 64.81 | 4 | 4d ago |
| HUMANITY’S LAST EXAM text-only | | Score | 11.1 | 4 | 4d ago |
| GPQA Diamond | | Score | 77.5 | 4 | 4d ago |
| MMLU-PRO | DeepSeek-V3.2 | Score | 84.6 | 4 | 4d ago |
| MMLU-Pro | Sigma-MoE-Tiny Base | EM (World Knowledge) | 38.13 | 3 | 4d ago |