| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Visual Question Answering | VQA v2 | Accuracy | 88.1 | 1,362 |
| Visual Question Answering | VQA v2 (test-dev) | Overall Accuracy | 87.66 | 706 |
| Visual Question Answering | VQA v2 (test-std) | Accuracy | 86.1 | 486 |
| Visual Question Answering | VQA 2.0 (test-dev) | Accuracy | 86.5 | 337 |
| Visual Question Answering | VQAv2 | Accuracy | 83.1 | 177 |
| Visual Question Answering | VQA (test-dev) | Acc (All) | 78.25 | 147 |
| Visual Question Answering | VQA v2 (val) | Accuracy | 95.06 | 144 |
| Visual Question Answering | VQA 2.0 (val) | Accuracy (Overall) | 76.5 | 143 |
| Visual Question Answering | VQA v2 (test) | Accuracy | 86.1 | 142 |
| Visual Question Answering | VQA (test-std) | Accuracy | 84 | 120 |
| Visual Question Answering | VQA v2 | Accuracy | 81.8 | 101 |
| Open-Ended Visual Question Answering | VQA 1.0 (test-dev) | Overall Accuracy | 66.7 | 100 |
| Visual Question Answering | VQAv2 (test) | VQA Accuracy | 83.4 | 82 |
| Visual Question Answering | VQAv2 (test-dev) | Accuracy | 86.1 | 80 |
| Visual Question Answering | VQA v2 | Accuracy | 80.1 | 71 |
| Visual Question Answering (Multiple-choice) | VQA 1.0 (test-dev) | Accuracy (All) | 70.04 | 66 |
| Visual Question Answering | VQA text | Accuracy | 83.2 | 61 |
| Visual Question Answering | VQA (val) | Overall Accuracy | 79.54 | 55 |
| Visual Question Answering | VQA | Accuracy | 69.7 | 52 |
| Open-Ended Visual Question Answering | VQA 1.0 (test-standard) | Overall Accuracy | 67.36 | 50 |
| Visual Question Answer | VQA 1.0 (test-dev) | Overall Accuracy | 67.42 | 44 |
| Visual Question Answering | VQA v2 | ASR | 100 | 42 |
| Visual Question Answering | VQAv2 (test-std) | Accuracy | 82.3 | 38 |
| Visual Question Answering | VQA v2 | VQAv2 Accuracy | 80.4 | 37 |
| Visual Question Answering | VQA v2 | Accuracy (Clean) | 74.5 | 37 |