Multimodal Understanding on Aggregate Audio-Visual & Video Benchmarks
[Chart: Avg Audio-Visual Score, Avg Video Score, and Relative Performance over time. Top Avg Audio-Visual Score: 57.6, Full Model (Qwen2.5-Omni-7B), Dec 11, 2025.]
Evaluation Results
| Method | Details | Date | Avg Audio-Visual Score | Avg Video Score | Relative Performance |
|---|---|---|---|---|---|
| Full Model (Qwen2.5-Omni-7B) | Model Scale=7B, Token... | 2025.12 | 57.6 | 66 | 100 |
| EchoingPixels | Model Scale=7B, Token... | 2025.12 | 56.2 | 58.4 | 94.1 |
| Full Model (Qwen2.5-Omni-3B) | Model Scale=3B, Token... | 2025.12 | 56.1 | 64.1 | 100 |
| EchoingPixels | Model Scale=3B, Token... | 2025.12 | 55.5 | 63.5 | 99 |
| EchoingPixels | Model Scale=3B, Token... | 2025.12 | 53.2 | 61.4 | 95.2 |
| EchoingPixels | Model Scale=3B, Token... | 2025.12 | 49.8 | 60.4 | 91 |
| IntraModal | Model Scale=7B, Token... | 2025.12 | 49 | 60.1 | 87.6 |
| IntraModal | Model Scale=3B, Token... | 2025.12 | 45.2 | 57.4 | 84.2 |