| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Reasoning and Knowledge Evaluation | Reasoning & Knowledge Suite (ARC-E, WG, SIQA, Hella., OBQA, CSQA, BA, MMLU) | ARC-Easy Accuracy: 51.05 | 15 |
| Reasoning and Knowledge | Reasoning and Knowledge Suite (MMLU, ARC-C, ARC-E, BoolQ, CSQA, HSwag, PIQA, SocIQ, Wino) (various) | MMLU: 75.78 | 14 |