| Dataset Name | SOTA Method | Metric | Trend | Updated | |
|---|---|---|---|---|---|
| MME | Jigsaw + CARE | Score: 2,565.72 | 727 | 1d ago | |
| MM-Vet | | Score: 64 | 196 | 7d ago | |
| MME | Qwen-2.5-VL-7B + EvoLMM | MME Score: 2,375.9 | 173 | 23h ago | |
| MMStar | InternVL2.5-78B | Accuracy: 69.5 | 139 | 19d ago | |
| MMBench CN | Qwen2.5-VL-7B | Accuracy: 82.37 | 120 | 7d ago | |
| MMBench | MM1 | MMB Score: 79.7 | 118 | 2d ago | |
| MME | TwigVLM | MME-P Score: 1,864 | 114 | 20d ago | |
| SEED-Bench | D2Dloc | Accuracy: 77.3 | 112 | 6d ago | |
| LLaVA-bench in-the-wild | Self-Aug | Score: 121.88 | 73 | 7d ago | |
| MME | Qwen3-VL-235B | Total Score: 2,631.7 | 67 | 15d ago | |
| MM-Bench | CoLLaVO | Accuracy: 83 | 57 | 3mo ago | |
| LLaVA-Bench | LLaVA-v1.6 (7B) w/ STIC | LLaVA-Bench Score: 79.2 | 48 | 5d ago | |
| MM-Vet v2 | Vero Q3T-8B | Score: 81.6 | 46 | 19d ago | |
| MME Chinese | | Overall Score: 60.37 | 43 | 2mo ago | |
| MME English | | Score: 60.21 | 43 | 2mo ago | |
| LLaVA-Bench-Wild (LLaVA-W) | MoE-LLaVA | Overall Score: 97.3 | 38 | 7d ago | |
| MMMU | | Score: 57.6 | 36 | 19d ago | |
| LLaVA Evaluation Suite 7B v1.5 (test) | | GQA: 61.9 | 34 | 5d ago | |
| SEED-Bench 2 Plus | | Accuracy: 71.67 | 29 | 2mo ago | |
| Combined Benchmark Suite (GQA, MMB, MME, VQA-T, SQA-I, VQA-v2) | | Relative Accuracy: 100 | 28 | 1mo ago | |
| SEED-Bench | LLaVA-1.5-7B | SEED-Bench Score: 66.8 | 28 | 2mo ago | |
| MMB | | Score: 85.31 | 27 | 3mo ago | |
| MME | | Total Score: 1,851 | 23 | 6d ago | |
| Vision-Flan Multimodal Suite | VisNec | Relative Score: 115.8 | 23 | 3mo ago | |
| MME | RCP | Absolute Score: 1,787.7 | 20 | 1mo ago | |