| Dataset Name | SOTA Method | Metric | | Updated | Trend |
|---|---|---|---|---|---|
| 11 Downstream Tasks Aggregate | | Average Accuracy 64.6 | 32 | 3d ago | |
| HellaSwag | Phi-3 Mini-4k | Accuracy 76.7 | 13 | 3d ago | |
| OBQA | TinyLlama-1.1B | Accuracy 25.2 | 7 | 3d ago | |
| BoolQ | LLaMA-MoE 2/8 | Accuracy 61.93 | 7 | 3d ago | |
| LogiQA | DIVE 2/8 | Accuracy 22.12 | 7 | 3d ago | |
| MathQA | TinyLlama-1.1B | Accuracy 24.32 | 7 | 3d ago | |
| ARC Challenge | TinyLlama-1.1B | Accuracy 35.67 | 7 | 3d ago | |
| ARC Easy | TinyLlama-1.1B | Accuracy 61.66 | 7 | 3d ago | |
| WinoGrande | TinyLlama-1.1B | Accuracy 59.43 | 7 | 3d ago | |
| PIQA | TinyLlama-1.1B | Accuracy 72.6 | 7 | 3d ago | |
| SciQ | TinyLlama-1.1B | Accuracy 89.3 | 7 | 3d ago | |