Multimodal Understanding on Aggregate Audio-Visual & Video Benchmarks

57.6 Avg Audio-Visual Score

Full Model (Qwen2.5-Omni-7B)

[Chart: y-axis ticks 44.704, 48.052, 51.4, 54.748; x-axis label Dec 11, 2025]
Updated 4d ago

Evaluation Results

Method  Links  Date     Scores
               2025.12  57.6   66     100
               2025.12  56.2   58.4   94.1
               2025.12  56.1   64.1   100
               2025.12  55.5   63.5   99
               2025.12  53.2   61.4   95.2
               2025.12  49.8   60.4   91
               2025.12  49     60.1   87.6
               2025.12  45.2   57.4   84.2