| Dataset Name | SOTA Method | Metric | Trend | | |
|---|---|---|---|---|---|
| General Ability Suite (ARC, HellaSwag, PIQA, BoolQ, WinoGrande, COPA, OBQA, SciQ) various (test) | FEM-SM | ARC-C Accuracy: 36.4 | 19 | 3d ago | |
| General Ability Suite ARC, HellaSwag, PIQA, BoolQ, WinoGrande, COPA, OBQA, SciQ | - | ARC-C Accuracy: - | 0 | 4d ago | |