| Dataset Name | SOTA Method | Metric | Score | Trend | Updated |
|---|---|---|---|---|---|
| BLINK | THINKLITE-VL | Accuracy | 75.9 | 241 | 2d ago |
| MMVP | SCF-VR | Accuracy | 76.67 | 118 | 7d ago |
| MME Perception | InternVL2-7B + SCR | MME^P | 1,742 | 50 | 2mo ago |
| AI2D | INTERNVL3-8B + PGT | Accuracy | 84 | 47 | 9d ago |
| BLINK (val) | | Validation Score | 95.67 | 44 | 1d ago |
| V* | Step-GUI-8B | Score | 89 | 42 | 5d ago |
| MMStar | Qwen3-VL-8B | Accuracy | 73.07 | 30 | 14d ago |
| MME | Self-Aug | Perception Score | 1,726.77 | 28 | 3mo ago |
| OCRBench | | Score | 877 | 22 | 16d ago |
| V* v1.0 (test) | | Score | 84.35 | 20 | 1mo ago |
| MME | | Perception Score | 2,415 | 20 | 2mo ago |
| MMBench (P) | MG-LLaVA | Accuracy | 80.1 | 20 | 3mo ago |
| Blink 41 (val) | | Score | 87.4 | 19 | 3mo ago |
| Relative Reflectance (RR) | Unsilencing Latent Reasoning | Accuracy | 44.78 | 18 | 29d ago |
| IQTest | Unsilencing Latent Reasoning | Accuracy | 32 | 18 | 29d ago |
| Counting | ICoT | Accuracy | 68.33 | 18 | 29d ago |
| SEED-Bench Image | MG-LLaVA | Accuracy | 73.7 | 18 | 3mo ago |
| HallusionBench | VL-Rethinker | Accuracy | 71.08 | 15 | 1mo ago |
| SeedBench-2-Plus | GPT-4o | Accuracy | 72 | 15 | 1mo ago |
| HR-8K (test) | ZwZ-8B | Accuracy | 82 | 15 | 3mo ago |
| HR-4K (test) | ZwZ-8B | Accuracy | 84.4 | 15 | 3mo ago |
| VStar (test) | ZwZ-4B | Accuracy | 92.7 | 15 | 3mo ago |
| MMBench (T) | MG-LLaVA | Accuracy (MMBench T) | 79.1 | 15 | 3mo ago |
| CVBench 3D | | Score | 89.1 | 13 | 21d ago |
| CVBench 2D | | Score | 72.4 | 13 | 21d ago |