| HellaSwag | COLLATE | Accuracy 99.21 | | 1,460 | 2d ago |
| WinoGrande | | Accuracy 94.1 | | 776 | 2d ago |
| PIQA | | Accuracy 94.9 | | 647 | 2d ago |
| CSQA | Token-ICS | Accuracy 96 | | 366 | 3d ago |
| Common Sense Reasoning Tasks | Dual LoRA | Avg Score 93 | | 241 | 2d ago |
| Winogrande | | Accuracy 85.3 | | 231 | 3d ago |
| Commonsense Reasoning (BoolQ, PIQA, SIQA, HellaS., WinoG., ARC-e, ARC-c, OBQA) (test) | PaLM (540B) | BoolQ Accuracy 88 | | 138 | 3d ago |
| CommonSenseQA | | Accuracy 91.2 | | 132 | 2d ago |
| ARC Challenge | SGE | Accuracy 93.28 | | 132 | 2d ago |
| StrategyQA | PaLM 2 | Accuracy 90.4 | | 125 | 2d ago |
| CSQA (test) | KEAR | Accuracy 89.4 | | 111 | 3d ago |
| SocialIQA | | Accuracy 88.1 | | 97 | 3d ago |
| SIQA | In-Squeeze | Accuracy 89.85 | | 96 | 3d ago |
| WinoGrande (val) | | Accuracy 73.88 | | 87 | 3d ago |
| StrategyQA (test) | SGE | Accuracy 83.49 | | 81 | 3d ago |
| OBQA | HydraLoRA | Accuracy 89.2 | | 75 | 2d ago |
| ARC-E | Self-consistency | Accuracy 96.4 | | 62 | 2d ago |
| Commonsense Reasoning (BoolQ, PIQA, SIQA, HellaS., WinoG., ARC-e, ARC-c, OBQA) | DoRA | BoolQ Accuracy 74.6 | | 61 | 3d ago |
| CommonSenseQA | | BS 0.1054 | | 54 | 3d ago |
| CommonsenseQA (val) | Qwen-1.5 14B (Teacher) | Accuracy 82.06 | | 52 | 3d ago |
| ARC-C | DIVER | Accuracy 91.1 | | 51 | 2d ago |
| PIQA 1.0 (test) | Mistral-7B + Uniform | Accuracy 82.21 | | 48 | 3d ago |
| Average 7 Commonsense Reasoning Tasks | | Avg Accuracy 72.04 | | 47 | 3d ago |
| CommonsenseQA (CSQA) v1.0 (test) | CKT-base | Accuracy 64.11 | | 46 | 3d ago |
| PIQA (test) | UNICORN | Accuracy 90.1 | | 46 | 3d ago |