| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multimodal Question Answering | MULTIMODALQA (test) | Overall Accuracy 75.5 | 17 |
| Multimodal Retrieval | MULTIMODALQA Doc (test) | Total Time (ms) 371 | 10 |
| Multimodal Question Answering | MultiModalQA (val) | EM 65.1 | 10 |
| Multimodal Information Retrieval | MultimodalQA (val) | R@1 20.53 | 7 |
| End-to-end Question Answering | MultimodalQA (test) | EM 44.57 | 7 |
| Multimodal Question Answering | MultiModalQA 95 (test) | Accuracy 82.7 | 6 |
| Retrieval | MultimodalQA | R@3 69.07 | 6 |
| Multi-modal Question Answering | MultiModalQA (dev) | F1 Score 85.28 | 5 |