| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Image Embedding | MMEB v1 (test) | Classification | 69.1 | 70 |
| Multimodal Embedding | MMEB | Classification Accuracy | 76.1 | 56 |
| Multi-modal Embedding | MMEB 1.0 (test) | Classification Accuracy | 67.6 | 52 |
| Multimodal Retrieval | MMEB | Classification Score | 788.1 | 50 |
| Multimodal Embedding Evaluation | MMEB V2 (test) | Image CLS Hit@1 | 69.8 | 35 |
| Multimodal Visual Document Retrieval | MMEB Visual Document portion v2 | ViDoRe V1 Score | 89.44 | 31 |
| Multimodal Retrieval and Understanding | MMEB V2 (test) | Image CLS Acc | 76.7 | 27 |
| Multimodal Retrieval | MMEB Image V2 | CLS Accuracy | 69.1 | 22 |
| Multimodal Ranking | MMEB | Classification Score | 70 | 22 |
| Multimodal Retrieval | MMEB v1 (test) | Classification | 61.2 | 18 |
| Multi-modal Representation Learning | MMEB OOD 1.0 | OOD Precision@1 | 59.1 | 18 |
| Multi-modal Representation Learning | MMEB In-Distribution 1.0 | MMEB IND Precision@1 | 71.6 | 18 |
| Multi-modal Representation Learning | MMEB Overall 1.0 | Classification P@1 | 61.6 | 18 |
| Multimodal Embedding Evaluation | MMEB Overall | Classification Score | 72.6 | 18 |
| Retrieval | MMEB v2 | Image Retrieval Score | 78.2 | 18 |
| Video Understanding | MMEB Video v2 | Classification Score (CLS) | 57.8 | 17 |
| Multimodal Retrieval | MMEB Total v2 | Overall Score | 68.1 | 15 |
| Multimodal Retrieval | MMEB Video V2 | CLS Accuracy | 51.6 | 15 |
| Image Understanding | MMEB Image v2 | Accuracy (CLS) | 68.1 | 9 |
| Zero-shot Image Classification | MMEB (val) | Image Classification Accuracy | 66.8 | 9 |
| Multimodal Video Retrieval | MMEB Video portion v2 | K700 Score | 56.8 | 9 |
| Video Retrieval | MMEB Video Retrieval (MSRVTT, MSVD, DiDeMo, YouCook2, VATEX) v2 (test) | Retrieval Score | 43.1 | 8 |
| Video Classification | MMEB Video Classification (Kinetics-700, SSv2, HMDB, UCF, Breakfast) v2 (test) | Classification Accuracy | 63.7 | 8 |
| Universal Multimodal Embedding | MMEB Total v2 | Total Score | 61.6 | 7 |
| Video Question Answering | MMEB Video QA v2 (test) | Average Score | 72.5 | 6 |