| Dataset Name | SOTA Method | Metric | Trend | | |
|---|---|---|---|---|---|
| tasks 0-shot | | Accuracy 63.94 | 74 | 21d ago | |
| ARC-C, ARC-E, BoolQ, and HellaSwag | | Accuracy 69.48 | 28 | 22d ago | |
| 11 Tasks zero-shot | | 0-shot Average 68.66 | 26 | 2mo ago | |
| 9 Downstream Tasks Utility | DAPT (nontoxic) | Average Accuracy 54.7 | 10 | 3mo ago | |
| Common-sense Zero-shot Benchmarks | | OpenQ (Zero-shot) 35 | 4 | 6d ago | |