Multimodal Benchmarking on MMMU
Figure: Accuracy on MMMU over time (Jul 2024 – Jan 2026). The current best result is Qwen3-VL-32B-Thinking at 78.1 accuracy.
Evaluation Results
| Method | Details | Date | Accuracy |
| --- | --- | --- | --- |
| Qwen3-VL-32B-Thinking | Parameters=32B, Thinki... | 2026.01 | 78.1 |
| Qwen3-VL-8B-Thinking | Parameters=8B, Thinkin... | 2026.01 | 74.1 |
| EvoCUA-32B | Parameters=32B | 2026.01 | 68.11 |
| EvoCUA-8B | Parameters=8B | 2026.01 | 62.11 |
| OpenCUA-72B | Parameters=72B | 2026.01 | 60.67 |
| EvoCUA-OpenCUA-72B | Base Model=OpenCUA-72B | 2026.01 | 59.22 |
| INF-LLaVA* | Source=Ours, larger_da... | 2024.07 | 37.2 |
| INF-LLaVA | Source=Ours | 2024.07 | 37.0 |
| LLaVA1.5 | Source=CVPR’24 | 2024.07 | 36.4 |
| ConvLLaVA | Source=arXiv’24 | 2024.07 | 35.8 |
| DeepStack-L-HD | Source=arXiv’24 | 2024.07 | 35.6 |
| AnyGPT | Source=arXiv’24 | 2024.07 | 30.6 |
| GenLLaVA | Source=arXiv’24 | 2024.07 | 29.7 |
| GILL | Source=NeurIPS’23 | 2024.07 | 28.8 |
| MGIE | Source=ICLR’24 | 2024.07 | 25.6 |
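For readers who want to work with these numbers programmatically, the leaderboard can be treated as a simple list of (method, accuracy) pairs. The sketch below copies the scores from the Evaluation Results table and ranks them; the variable names are illustrative, not part of any benchmark API.

```python
# Leaderboard scores copied from the Evaluation Results table above.
# (Method names and accuracies are from the table; structure is illustrative.)
results = [
    ("Qwen3-VL-32B-Thinking", 78.1),
    ("Qwen3-VL-8B-Thinking", 74.1),
    ("EvoCUA-32B", 68.11),
    ("EvoCUA-8B", 62.11),
    ("OpenCUA-72B", 60.67),
    ("EvoCUA-OpenCUA-72B", 59.22),
    ("INF-LLaVA*", 37.2),
    ("INF-LLaVA", 37.0),
    ("LLaVA1.5", 36.4),
    ("ConvLLaVA", 35.8),
    ("DeepStack-L-HD", 35.6),
    ("AnyGPT", 30.6),
    ("GenLLaVA", 29.7),
    ("GILL", 28.8),
    ("MGIE", 25.6),
]

# Rank methods by accuracy, highest first.
ranked = sorted(results, key=lambda r: r[1], reverse=True)

best_method, best_acc = ranked[0]
print(best_method, best_acc)  # → Qwen3-VL-32B-Thinking 78.1
```

Sorting locally like this makes it easy to compute gaps between entries (e.g. the 4.0-point lead of the 32B model over the 8B variant) without re-reading the page.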