| Dataset | SOTA Method | Metric | Value | Trend | Last Updated |
|---|---|---|---|---|---|
| MME | Jigsaw + CARE | Score | 2,565.72 | 658 | 10d ago |
| MM-Vet | | Score | 64 | 180 | 3d ago |
| MMBench | MM1 | MMB Score | 79.7 | 118 | 1mo ago |
| SEED-Bench | Jigsaw | Accuracy | 77.01 | 95 | 1mo ago |
| MMBench CN | mPLUG-Owl3 | Accuracy | 74.3 | 83 | 10d ago |
| MME | TwigVLM | MME-P Score | 1,864 | 73 | 5d ago |
| MME | Qwen-2.5-VL-7B + EvoLMM | MME Score | 2,375.9 | 73 | 3d ago |
| MMStar | InternVL2.5-78B | Accuracy | 69.5 | 70 | 1mo ago |
| MM-Bench | CoLLaVO | Accuracy | 83 | 57 | 1mo ago |
| LLaVA-Bench in-the-wild | Self-Aug | Score | 121.88 | 56 | 1mo ago |
| MME Chinese | | Overall Score | 60.37 | 43 | 1mo ago |
| MME English | | Score | 60.21 | 43 | 1mo ago |
| LLaVA-Bench | LLaVA-v1.6 (7B) w/ STIC | LLaVA-Bench Score | 79.2 | 38 | 1mo ago |
| SEED-Bench 2 Plus | | Accuracy | 71.67 | 29 | 1mo ago |
| Combined Benchmark Suite (GQA, MMB, MME, VQA-T, SQA-I, VQA-v2) | | Relative Accuracy | 100 | 28 | 5d ago |
| SEED-Bench | LLaVA-1.5-7B | SEED-Bench Score | 66.8 | 28 | 1mo ago |
| MMB | | Score | 85.31 | 27 | 1mo ago |
| LLaVA-Bench-Wild (LLaVA-W) | MoE-LLaVA | Overall Score | 97.3 | 24 | 1mo ago |
| Vision-Flan Multimodal Suite | VisNec | Relative Score | 115.8 | 23 | 1mo ago |
| MME | RCP | Absolute Score | 1,787.7 | 20 | 10d ago |
| MME | | Total Score | 1,513.8 | 16 | 1mo ago |
| MME-RealWorld | ACE-Brain-0-8B | Accuracy | 71.2 | 15 | 1mo ago |
| SEED Image | | Accuracy | 77.1 | 15 | 1mo ago |
| MME | TwigVLM | Perception Score | 1,864 | 14 | 5d ago |
| MME-P | BAGEL | MME-P Score | 1,687 | 14 | 1mo ago |