| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Natural Language Explanation Generation | ECQA | Human Evaluation Score: 73.33 | 7 |
| Commonsense Question Answering | ECQA (test) | Accuracy: 79.7 | 7 |
| Explanation Generation | ECQA (out-domain) | Grammar Score: 2.99 | 7 |
| Natural Language Explanation Generation | ECQA (test) | Accuracy: 59.4 | 6 |
| Explanation Generation | ECQA complete (test) | BERTScore: 87.67 | 6 |
| Open-Label QA | ECQA | COS-E: 0.398 | 4 |
| Commonsense Reasoning | ECQA | Pass@1: 0.7612 | 3 |
| Natural Language Explanation Generation | ECQA few-shot 60-shot | Accuracy: 24.53 | 3 |
| Commonsense Question Answering | ECQA | Performance Score (Finetune Baseline vs Predict Baseline): 57.2 | 2 |