SEED-Bench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Multimodal Understanding | SEED-Bench | Accuracy: 81.7 | 516 |
| Multimodal Understanding | SEED-Bench Image | Accuracy: 78 | 143 |
| Multimodal Evaluation | SEED-Bench | Accuracy: 77.3 | 112 |
| Visual Question Answering | SEED-Bench Image | Accuracy: 76.9 | 78 |
| Multimodal Reasoning | SEED-Bench Image | Score: 78.6 | 60 |
| Vision-Language Evaluation | SEED-Bench | Accuracy: 74.74 | 50 |
| Multi-modal Understanding | SEED-Bench (overall) | Overall Score: 62.9 | 40 |
| Multimodal Reasoning | SEED-BENCH | Accuracy: 69.9 | 36 |
| Video Understanding | SEED-Bench Video Understanding | Accuracy: 74.12 | 33 |
| Multimodal Understanding | SEED Bench Img | SEEDB Score: 77 | 32 |
| Multimodal Evaluation | SEED-Bench 2 Plus | Accuracy: 71.67 | 29 |
| Multimodal Evaluation | SEED-Bench | SEED-Bench Score: 66.8 | 28 |
| Image Understanding | SEED-Bench image | Accuracy: 83.1 | 27 |
| Video Reasoning | Seed-Bench R1 | Average Answer Score: 50.5 | 26 |
| Multi-modal Benchmarking | SEED-Bench | Score: 60.5 | 25 |
| Visual Understanding | SEED-Bench | SEED Score: 71.8 | 23 |
| Visual Question Answering | SEED-Bench | Accuracy: 94.5 | 22 |
| Visual Question Answering | SEED-Bench 2-Plus | Accuracy: 70.32 | 21 |
| OCR-related Understanding Tasks | SEED-Bench-2-Plus | Accuracy: 76.5 | 21 |
| Multimodal Question Answering | SEED-Bench | Accuracy (All): 71.1 | 21 |
| Benchmark Compression (Coreset selection) | SEED-Bench-2-Plus (full) | rho: 0.874 | 20 |
| Multimodal Understanding | SEED-Bench SEED-I | Accuracy: 87.7 | 20 |
| Multimodal Understanding | SEED-Bench Image (test) | Accuracy: 75.9 | 20 |
| Multimodal Question Answering | SEED-Bench IMG | Accuracy: 71.56 | 18 |
| Multimodal Large Language Model Evaluation | SEED-Bench | Accuracy: 71.56 | 18 |
Showing 25 of 67 rows