| Dataset Name | SOTA Method | Metric | Trend | Updated | |
|---|---|---|---|---|---|
| MMIU | InternVL2.5-78B | Accuracy 55.8 | 60 | 2d ago | |
| MuirBench | | Score 68 | 26 | 2d ago | |
| MMT-Bench (val) | Qwen2-VL-72B | Score 71.8 | 23 | 2d ago | |
| BLINK (val) | | Score 68 | 23 | 2d ago | |
| MuirBench (test) | | Accuracy 68 | 21 | 3d ago | |
| MMIU 106 (test) | | Score 72.1 | 19 | 3d ago | |
| MuirBench 142 (test) | | Score 86.1 | 19 | 3d ago | |
| QBench2 | DelimScaling | Accuracy 81.7 | 18 | 3d ago | |
| MuirBench Multi-image Understanding | | Accuracy 62.3 | 17 | 3d ago | |
| MIRB (test) | Qwen2VL-7B | Accuracy 60.8 | 12 | 3d ago | |
| Blink (test) | InternVL2-Llama3-76B | Accuracy 56.8 | 12 | 3d ago | |
| BLINK multi-img | Qwen2.5-VL-7B | Accuracy 55.6 | 11 | 3d ago | |
| MIBench | Migician | Accuracy 71.42 | 11 | 3d ago | |
| MMT (val) | InternVL2-Llama3-76B | Accuracy 67.4 | 11 | 3d ago | |
| MMIU (test) | Qwen2VL-7B | Accuracy 52.6 | 11 | 3d ago | |
| NLVR2 (test) | attention-masking strategy | Accuracy 87.3 | 9 | 3d ago | |
| MEGABench (test) | | Accuracy 54.2 | 7 | 3d ago | |
| MMSI-Bench | Gemini 1.5 Pro | Accuracy 36.9 | 6 | 3d ago | |