| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| MM-Vet | GenRecal (InternVL3.5-8B) | MM-Vet Score | 86.2 | 431 | 4d ago |
| MMMU (val) | OpenAI-o1 | Accuracy | 78.2 | 144 | 1mo ago |
| MMStar | Masters | Accuracy | 82 | 143 | 18d ago |
| MMMU | Gemini-2.5 (Pro) | Accuracy | 83.89 | 130 | 10d ago |
| WeMath | TGRL-DAPO | Accuracy | 72.2 | 129 | 4d ago |
| MMMU Pro | CoT2-Meta | Accuracy | 85.6 | 107 | 8d ago |
| MathVision | MiMo-VL-7B | Accuracy | 57.9 | 102 | 4d ago |
| LogicVista | RTWI | Accuracy | 61.7 | 99 | 4d ago |
| MathVerse | Qwen2.5-VL-72B w/ RAPID | Accuracy | 56.2 | 84 | 4d ago |
| MMBench | Qwen2.5-VL | Overall Score | 88.15 | 78 | 4d ago |
| MathVista | Qwen3-VL-32B-Thinking | Accuracy | 85.9 | 72 | 4d ago |
| M^3CoT | DAP-ICoT | Accuracy | 58.7 | 70 | 25d ago |
| DynaMath | SwimBird | Accuracy | 67.2 | 58 | 25d ago |
| M3CoT (test) | | Total Acc | 91.61 | 47 | 4d ago |
| MMBench (dev) | GPT-4o | Accuracy | 87.6 | 47 | 1mo ago |
| HallusionBench | TGRL-DAPO | Accuracy | 0.7293 | 42 | 19d ago |
| MathVista | | Pass@1 | 89.8 | 36 | 25d ago |
| MMBench CN | Instruct | Accuracy | 82 | 36 | 25d ago |
| MMMU (test) | GPT-4o | Accuracy | 64.7 | 34 | 1mo ago |
| SEED-Bench Image | Sphinx | Score | 74.2 | 32 | 1mo ago |
| O3-BENCH (test) | INSIGHT-O3 | Chart Score | 0.756 | 30 | 1mo ago |
| MMStar | Octopus-8B (Ours) | Accuracy | 75.2 | 29 | 1mo ago |
| MathVerse MINI | Qwen3-VL-8B-Thinking | Accuracy | 77.7 | 25 | 1mo ago |
| LMMs-Eval Average of 12 benchmarks | InternVL2-8B | Average Accuracy | 70.83 | 25 | 1mo ago |
| MMMU-Pro | | Std-10 Score | 55 | 25 | 1mo ago |