
Combined Benchmark Suite

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Aggregate Model Performance | Combined Benchmark Suite | Average Score: 100 | 57
Multimodal Evaluation | Combined Benchmark Suite (GQA, MMB, MME, VQA-T, SQA-I, VQA-v2) | Relative Accuracy: 100 | 28