| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Textual Explanation Generation | VQA-X (test) | BLEU-4 26.5 | 11 |
| VQA Natural Language Explanation | VQA-X filtered (test) | BLEU-1 64.7 | 8 |
| Visual Question Answering | VQA-X e-ViL (test) | Human Evaluation Score 89.9 | 7 |
| Multimodal Explanation | VQA-X | F1 Score 94.11 | 6 |
| Visual Explanation Generation | VQA-X (test) | EMD 2.41 | 6 |