Comprehensive Multimodal Evaluation on Multimodal Evaluation Suite Composite
[Chart: Overall Score over time. Top entry: EVE (Ours-8B-iter4), Overall Score 68.7, Apr 20, 2026.]
Evaluation Results
Method                 Model Category   Date      Overall Score
EVE (Ours-8B-iter4)    Pseudo-...       2026.04   68.7
VisPlay-8B-iter3       Pseudo-...       2026.04   67.7
Jigsaw-R1-8B           Templat...       2026.04   67.5
MM-Zero-8B-iter3       Pseudo-...       2026.04   66.7
Qwen3-VL-8B-Instruct   Open-So...       2026.04   66.5