| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Commonsense Validation and Explanation | ComVE (test) | Accuracy: 98.5 | 13 |
| Natural Language Explanation Generation | ComVE | Human Evaluation Score: 70 | 7 |
| Natural Language Explanation Generation | ComVE few-shot 60-shot | Accuracy: 68.03 | 3 |
| Commonsense Validation and Explanation | ComVE | Performance (F+B -> P+B): 0.88 | 2 |