| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Multimodal Understanding | SEED-Bench | Accuracy | 81.7 | 203 |
| Multimodal Understanding | SEED-Bench Image | Accuracy | 78 | 82 |
| Multimodal Evaluation | SEED-Bench | Accuracy | 77.01 | 80 |
| Visual Question Answering | SEED-Bench Image | Accuracy | 76.9 | 64 |
| Multi-modal Understanding | SEED-Bench (overall) | Overall Score | 62.9 | 40 |
| Video Understanding | SEED-Bench Video Understanding | Accuracy | 74.12 | 33 |
| Multimodal Reasoning | SEED-Bench Image | Score | 74.2 | 32 |
| Multimodal Understanding | SEED Bench Img | SEEDB Score | 77 | 32 |
| Multimodal Question Answering | SEED-Bench | Accuracy (All) | 71.1 | 21 |
| Image Understanding | SEED-Bench image | Accuracy | 76.55 | 20 |
| Benchmark Compression (Coreset selection) | SEED-Bench-2-Plus (full) | rho | 0.874 | 20 |
| Multimodal Understanding | SEED-Bench SEED-I | Accuracy | 87.7 | 20 |
| Multimodal Understanding | SEED-Bench Image (test) | Accuracy | 75.9 | 20 |
| Visual Perception | SEED-Bench Image | Accuracy | 73.7 | 18 |
| Multimodal Understanding | SEED-Bench (val) | Accuracy | 58.8 | 16 |
| Multimodal Evaluation | SEED-Bench | SEED-Bench Score | 66.8 | 15 |
| Multimodal Understanding | SEED-Bench 1 | Image Accuracy | 73.5 | 15 |
| Spatial Understanding | SEED-Bench Spatial | Accuracy | 66.28 | 15 |
| Multimodal Understanding | SEED-Bench Image Part | Accuracy | 75.9 | 15 |
| Multimodal Reasoning | SEED-BENCH | Accuracy | 69.9 | 14 |
| Multi-modal Understanding | SEED-Bench all (val) | Accuracy | 65.6 | 14 |
| General Visual Question Answering | SEED-Bench IMG 2023a | Accuracy | 77 | 13 |
| Visual Reasoning | SEED-Bench 2-Plus | Accuracy | 72 | 11 |
| Multimodal Evaluation | SEED-Bench Image | Accuracy | 77.39 | 10 |
| Comprehensive Multimodal Evaluation | SEED-Bench Image | Accuracy | 77.3 | 10 |