| Dataset Name | SOTA Method | Metric | Trend | Updated |
|---|---|---|---|---|
| MMIU | InternVL2.5-78B | Accuracy 55.8 | 65 | 4d ago |
| QBench2 | DelimScaling | Accuracy 81.7 | 30 | 1mo ago |
| MuirBench | | Score 68 | 26 | 1mo ago |
| MMT-Bench (val) | Qwen2-VL-72B | Score 71.8 | 23 | 1mo ago |
| BLINK (val) | | Score 68 | 23 | 1mo ago |
| MIBench | Qwen2.5-VL | Accuracy 72.42 | 22 | 1mo ago |
| MileBench (test) | FlashCache | Temporal Multi-Image Score (Task T) 57.3 | 21 | 1mo ago |
| MuirBench (test) | | Accuracy 68 | 21 | 1mo ago |
| MMIU 106 (test) | | Score 72.1 | 19 | 1mo ago |
| MuirBench 142 (test) | | Score 86.1 | 19 | 1mo ago |
| MuirBench Multi-image Understanding | | Accuracy 62.3 | 17 | 1mo ago |
| MIRB (test) | Qwen2VL-7B | Accuracy 60.8 | 12 | 1mo ago |
| Blink (test) | InternVL2-Llama3-76B | Accuracy 56.8 | 12 | 1mo ago |
| BLINK multi-img | Qwen2.5-VL-7B | Accuracy 55.6 | 11 | 1mo ago |
| MMT (val) | InternVL2-Llama3-76B | Accuracy 67.4 | 11 | 1mo ago |
| MMIU (test) | Qwen2VL-7B | Accuracy 52.6 | 11 | 1mo ago |
| NLVR2 (test) | attention-masking strategy | Accuracy 87.3 | 9 | 1mo ago |
| MEGABench (test) | | Accuracy 54.2 | 7 | 1mo ago |
| MMSI-Bench | Gemini 1.5 Pro | Accuracy 36.9 | 6 | 1mo ago |
| Blink | Ours | Accuracy 61.18 | 5 | 4d ago |