Video Multi-modal Understanding on Video-MME
[Chart: Accuracy (No Subtitles) and Accuracy (With Subtitles) over time. Highlighted point: Qwen2-VL-7B, 63.3 Accuracy (No Subtitles), Jan 30, 2026.]
Updated 4d ago
Evaluation Results
Method                            Settings                   Date     Accuracy (No Subtitles)  Accuracy (With Subtitles)
Qwen2-VL-7B                       Model=Qwen2-VL-7B          2026.01  63.3                     69
VisionTrim on Qwen2-VL-7B         Model=Qwen2-VL-7B, Tok...  2026.01  63.1                     68.9
VisionTrim on LLaVA-OneVision-7B  Model=LLaVA-OneVision-...  2026.01  59.5                     61.4
LLaVA-OneVision-7B                Model=LLaVA-OneVision-...  2026.01  58.2                     61.5