| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual Question Answering | VQA v2 | Accuracy: 88.1 | 1,165 |
| Visual Question Answering | VQA v2 (test-dev) | Overall Accuracy: 86 | 664 |
| Visual Question Answering | VQA v2 (test-std) | Accuracy: 86.1 | 466 |
| Visual Question Answering | VQA 2.0 (test-dev) | Accuracy: 86.5 | 337 |
| Visual Question Answering | VQAv2 | Accuracy: 83.1 | 177 |
| Visual Question Answering | VQA (test-dev) | Acc (All): 78.25 | 147 |
| Visual Question Answering | VQA 2.0 (val) | Accuracy (Overall): 76.5 | 143 |
| Visual Question Answering | VQA v2 (test) | Accuracy: 86.1 | 131 |
| Visual Question Answering | VQA (test-std) | Accuracy: 84 | 110 |
| Open-Ended Visual Question Answering | VQA 1.0 (test-dev) | Overall Accuracy: 66.7 | 100 |
| Visual Question Answering | VQA v2 (val) | Accuracy: 86.1 | 99 |
| Visual Question Answering | VQAv2 (test-dev) | Accuracy: 86.1 | 76 |
| Visual Question Answering | VQAv2 (test) | VQA Accuracy: 79.4 | 72 |
| Visual Question Answering (Multiple-choice) | VQA 1.0 (test-dev) | Accuracy (All): 70.04 | 66 |
| Visual Question Answering | VQA (val) | Overall Accuracy: 79.54 | 55 |
| Open-Ended Visual Question Answering | VQA 1.0 (test-standard) | Overall Accuracy: 67.36 | 50 |
| Visual Question Answering | VQA text | Accuracy: 82.2 | 48 |
| Visual Question Answer | VQA 1.0 (test-dev) | Overall Accuracy: 67.42 | 44 |
| Visual Question Answering | VQA v2 | Accuracy (Clean): 74.5 | 37 |
| Visual Question Answering | VQA v2 | Accuracy: 79.01 | 36 |
| Visual Question Answering | VQAv2 | Accuracy: 54.1 | 36 |
| Open-ended Visual Question Answering | VQA (test-standard) | Accuracy (Overall): 83.3 | 32 |
| Visual Question Answering | VQA v2 (std) | Accuracy: 84.3 | 31 |
| Visual Question Answering | VQAv2 (test-std) | Accuracy: 82.3 | 30 |
| Visual Question Answering | VQA v2 (dev) | Accuracy: 84.3 | 30 |