| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| MMBench | Qwen3-VL-32B | Accuracy | 90.6 | 637 | 2d ago |
| MM-Vet | InternVL3.5-38B | MM-Vet Score | 82.2 | 531 | 2d ago |
| MMMU | | Accuracy | 81.8 | 437 | 16d ago |
| SEED-Bench | LLaVA-UHD | Accuracy | 81.7 | 343 | 3d ago |
| MMStar | InternVL3-8B-Masters | Accuracy | 82 | 324 | 3d ago |
| MME | Qwen2-VL-7B | MME Score | 2,322 | 207 | 1mo ago |
| SEED | InternVL3-8B-Masters | Accuracy | 82.6 | 183 | 18d ago |
| MMBench CN | InternVL2.5-78B | Accuracy | 88.5 | 174 | 3d ago |
| MMMU (val) | | MMMU Score | 85.2 | 152 | 4d ago |
| MMBench (MMB) | VLsI-7B | Accuracy | 86.3 | 141 | 5d ago |
| SEED-Bench Image | LLaVA-OneVision-72B | Accuracy | 78 | 121 | 3d ago |
| MM-VET (test) | GPT-4V | Total Score | 67.6 | 120 | 9d ago |
| MMMU (test) | Qwen3-VL | MMMU Score | 69.6 | 112 | 9d ago |
| SEED-2-Plus | VLSI-2B | Accuracy | 81.1 | 110 | 1mo ago |
| LLaVA Evaluation Suite 1.5 | Vanilla | Average Score | 100 | 95 | 18d ago |
| POPE | X-Omni | POPE Score | 0.893 | 90 | 15d ago |
| MME | InternVL3 | Score | 2,393 | 83 | 3d ago |
| MMMU | EMMA | MMMU Score | 62.5 | 78 | 1mo ago |
| SEEDBench2 Plus | MIRROR (ours) | Accuracy | 76.86 | 74 | 1mo ago |
| LLaVA-Bench | ResDec | Overall Score | 91.9 | 72 | 3d ago |
| MMBench Chinese | Qwen3-VL-30B A3B-Instruct | MMB Benchmark (CN) | 89.5 | 70 | 1mo ago |
| MMMU | HeadLens | MMMU Score | 60.74 | 69 | 9d ago |
| MMBench (test) | MMCTAgent | Accuracy | 84.2 | 67 | 3d ago |
| MMBench EN v1.1 | Pixelis (Qwen3-VL-8B-Instruct) | Accuracy | 89.5 | 63 | 22d ago |
| MMMU | | MMMU Score | 81.8 | 59 | 2d ago |