| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Commonsense Reasoning | PIQA | Accuracy 94.9 | 757 |
| Physical Commonsense Reasoning | PIQA | Accuracy 94.9 | 696 |
| Question Answering | PIQA | Accuracy 86.5 | 505 |
| Physical Interaction Question Answering | PIQA | Accuracy 94.9 | 415 |
| Commonsense Reasoning | PIQA | Accuracy 89.99 | 213 |
| Reasoning | PIQA | Accuracy 96.5 | 164 |
| Physical Commonsense Reasoning | PIQA (val) | Accuracy 83 | 118 |
| Common Sense Reasoning | PIQA | Accuracy 91.89 | 100 |
| Physical Commonsense Reasoning | PIQA | Accuracy (PIQA) 81.5 | 99 |
| Physical Reasoning | PIQA | Accuracy 82.5 | 90 |
| Physical Commonsense Reasoning | PIQA | Accuracy 85.91 | 78 |
| Commonsense reasoning | PIQA 1.0 (test) | Accuracy 82.21 | 64 |
| Multiple Choice Question Answering | PIQA | Accuracy 80.5 | 63 |
| Zero-shot Reasoning | PIQA | PIQA Zero-shot Accuracy 80.9 | 62 |
| Physical Commonsense Reasoning | PIQA (test) | Accuracy 90.7 | 59 |
| Commonsense Reasoning | PIQA (test) | Accuracy 90.1 | 57 |
| Physical Commonsense Reasoning | PIQA | Accuracy 7,497 | 56 |
| Physical Commonsense Reasoning | PIQA | Accuracy 82.54 | 45 |
| Physical Commonsense Reasoning | PiQA | Accuracy 76.56 | 45 |
| Commonsense Reasoning | PIQA | Normalized Accuracy 85.47 | 41 |
| Question Answering | PIQA (test) | Accuracy 85 | 40 |
| Question Answering | PiQA | Accuracy 81.77 | 36 |
| Physical Reasoning | PIQA | Accuracy 91.3 | 34 |
| Zero-shot Accuracy | PIQA | Zero-shot PIQA Accuracy 81.5 | 30 |
| Inactive Attention Head Identification | PIQA | Percentage of Heads Zeroed 31.3 | 28 |