| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual Question Answering | xGQA | Avg_mul Score: 48.04 | 10 |
| Visual Question Answering | xGQA (test) | Accuracy (en): 56.91 | 6 |
| Cross-lingual Visual Question Answering | xGQA | EN Accuracy: 56.68 | 5 |
| Visual Question Answering | xGQA yes/no (test) | Accuracy (en): 53.22 | 4 |