| Dataset Name | SOTA Method | Metric | Trend | ||
|---|---|---|---|---|---|
| MMLU-CoT, GSM8k, HellaSwag, WinoGrande zero-shot Llama-3.1-8B-Instruct | | MMLU-CoT Accuracy 72.76 | 30 | 1mo ago | |
| Vicuna-7B Zero-shot (PIQA, ARC-C, ARC-E, HellaS, WinoG, BoolQ, LAMBADA, C4) | L2QER | PIQA Accuracy 76.33 | 6 | 22d ago | |