| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | CODAH (test) | Accuracy: 85.79 | 24 |
| Commonsense Sentence Completion | CODAH (test) | Accuracy: 84.3 | 6 |
| Commonsense Reasoning | CODAH Synonym Replacement WordNet-based (test) | Accuracy: 76.2 | 6 |
| Robustness to TextFooler-based adversarial attacks | CODAH (test) | Failure Rate: 30.9 | 6 |
| Scene Completion | X-CODAH | Score (EN): 69.9 | 6 |