| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---:|---:|---|
| MME | | Score | 2,312 | 39 | 3d ago |
| Image Understanding Suite (TextVQA, ChartQA, MMStar, MMBench, MMVet, MME, RealWorldQA, COCO) | InternVL-3.5-30B-A3B-HF | TextVQA Score | 85.76 | 34 | 3d ago |
| SEED-Bench image | LFM2-VL-3B | Accuracy | 76.55 | 20 | 3d ago |
| COCO | | Score | 69.3 | 16 | 3d ago |
| MMBench | | Score | 83.81 | 16 | 3d ago |
| MMStar | | Score | 62.49 | 16 | 3d ago |
| TextVQA | | Accuracy | 85.76 | 16 | 3d ago |
| MMBench CN | | Accuracy | 58.1 | 13 | 3d ago |
| MME-P | BAGEL-7B | MME-P Score | 1,687 | 11 | 3d ago |
| MMBench v1.1 (test) | BAGEL | MMB^i Score | 85 | 10 | 3d ago |
| MMIU | VideoChat-TPO | MMIU Score | 40.2 | 7 | 3d ago |
| SEED-Bench 2 | VideoChat-TPO | SEED-2 Image Score | 67.3 | 6 | 3d ago |
| Jetson Orin Nano Performance Benchmark | Mobile-O-0.5B | Vision Encoding Time (ms) | 88 | 4 | 3d ago |
| MacBook M2 Pro | Mobile-O-0.5B | Vision Encoding Time (ms) | 56 | 4 | 3d ago |
| COCO GPT-based evaluation | Chat-UniVi | Conversation Score | 84.1 | 4 | 3d ago |
| MathVerse | Sparse-LaViDa | Accuracy | 37.9 | 2 | 3d ago |
| MathVista | LaViDa-O | Accuracy | 56.9 | 2 | 3d ago |
| DocVQA | Sparse-LaViDa | Accuracy | 75.7 | 2 | 3d ago |
| MMB | LaViDa-O | Accuracy | 76.4 | 2 | 3d ago |
| iPhone 17 Pro Performance Benchmark | Mobile-O-0.5B | Vision Encoding Time (ms) | 102 | 1 | 3d ago |