| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Multimodal Understanding | SEED-Bench | Accuracy | 81.7 | 343 |
| Multimodal Understanding | SEED-Bench Image | Accuracy | 78 | 121 |
| Multimodal Evaluation | SEED-Bench | Accuracy | 77.01 | 95 |
| Visual Question Answering | SEED-Bench Image | Accuracy | 76.9 | 64 |
| Multi-modal Understanding | SEED-Bench (overall) | Overall Score | 62.9 | 40 |
| Vision-Language Evaluation | SEED-Bench | Accuracy | 74.74 | 34 |
| Video Understanding | SEED-Bench Video Understanding | Accuracy | 74.12 | 33 |
| Multimodal Reasoning | SEED-Bench Image | Score | 74.2 | 32 |
| Multimodal Understanding | SEED Bench Img | SEEDB Score | 77 | 32 |
| Multimodal Evaluation | SEED-Bench 2 Plus | Accuracy | 71.67 | 29 |
| Multimodal Evaluation | SEED-Bench | SEED-Bench Score | 66.8 | 28 |
| Image Understanding | SEED-Bench image | Accuracy | 83.1 | 27 |
| Video Reasoning | Seed-Bench R1 | Average Answer Score | 50.5 | 26 |
| Multi-modal Benchmarking | SEED-Bench | Score | 60.5 | 25 |
| Visual Understanding | SEED-Bench | SEED Score | 71.8 | 23 |
| Multimodal Question Answering | SEED-Bench | Accuracy (All) | 71.1 | 21 |
| Benchmark Compression (Coreset selection) | SEED-Bench-2-Plus (full) | rho | 0.874 | 20 |
| Multimodal Understanding | SEED-Bench SEED-I | Accuracy | 87.7 | 20 |
| Multimodal Understanding | SEED-Bench Image (test) | Accuracy | 75.9 | 20 |
| Visual Perception | SEED-Bench Image | Accuracy | 73.7 | 18 |
| Video Reasoning | SEED-Bench L3 OOD R1 | Accuracy | 49.3 | 16 |
| Video Reasoning | SEED-Bench L2 OOD R1 | Accuracy | 51.6 | 16 |
| Video Reasoning | SEED-Bench-R1 L1 In-Dist. | Accuracy | 50.5 | 16 |
| Multimodal Understanding | SEED-Bench (val) | Accuracy | 58.8 | 16 |
| Multimodal Understanding | SEED-Bench 1 | Image Accuracy | 73.5 | 15 |