| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | OpenBookQA | Accuracy: 94.4 | 465 |
| Question Answering | OpenBookQA | Accuracy: 96.07 | 305 |
| Question Answering | OpenBookQA | Accuracy: 84.4 | 145 |
| Question Answering | OpenBookQA (OBQA) (test) | OBQA Accuracy: 92.4 | 130 |
| Question Answering | OpenBookQA | Accuracy: 95.2 | 119 |
| Commonsense Reasoning | OpenBookQA | Accuracy: 91.2 | 108 |
| Question Answering | OpenBookQA | Normalized Accuracy: 55.6 | 102 |
| Reasoning | OpenBookQA | Accuracy: 88.4 | 92 |
| Science Question Answering | OpenBookQA | Accuracy: 94.6 | 82 |
| Multiple-choice Question Answering | OpenBookQA (test) | Accuracy: 91 | 61 |
| Open-book Question Answering | OpenBookQA 1.0 (test) | Accuracy: 35 | 33 |
| Zero-shot Reasoning | OpenbookQA | Accuracy: 44 | 26 |
| Multiple Choice Question Answering | OpenBookQA | Accuracy: 44.4 | 26 |
| Question Answering | OpenBook-QA | Accuracy: 91.6 | 24 |
| Question Answering | OpenbookQA | Accuracy: 87.6 | 23 |
| Question Answering | OpenbookQA (OQA) (val) | Accuracy: 36.6 | 22 |
| Question Answering | OpenBookQA (dev) | Accuracy: 90 | 22 |
| Common Sense | OpenBookQA | Accuracy: 81.8 | 21 |
| Question Answering | OpenBookQA | Composite Score: 92.14 | 20 |
| Question Answering | OpenBookQA | Attack Success Rate (ASR): 100 | 20 |
| Question Answering | OpenBookQA | Accuracy: 85.1 | 17 |
| Question Answering | OpenBookQA | OpQA Score: 47 | 15 |
| Question Answering | OpenBookQA | Mean Per-Step Regret: 0.157 | 15 |
| Question Answering | OpenBookQA published (test) | Accuracy: 65.4 | 15 |
| Commonsense Reasoning | OpenBookQA | Accuracy (Inter-layer): 75.6 | 15 |