| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Multi-image Reasoning | MuirBench | Accuracy | 77.2 | 48 |
| Multi-image Understanding | MuirBench | Score | 68 | 26 |
| Multi-image reasoning | Muirbench (test) | Accuracy | 68 | 24 |
| Multi-Image Understanding | MuirBench (test) | Accuracy | 68 | 21 |
| Multi-Image Understanding | MuirBench 142 (test) | Score | 86.1 | 19 |
| Multi-image Understanding | MuirBench Multi-image Understanding | Accuracy | 62.3 | 17 |
| Multimodal Reasoning | MuirBench | Accuracy | 57.14 | 11 |
| Procedural Temporal Understanding | MuirBench (test) | Overall Score | 65.04 | 7 |
| General Visual Question Answering | MuirBench | Score | 70.7 | 5 |
| Comprehensive Multi-image | MuirBench | Accuracy | 62.3 | 4 |
| Multi-image Multi-modal Understanding | MuirBench | Accuracy | 41.8 | 2 |