Expert-level multidisciplinary QA on MMMU (dev val)
Best result: Vision-SR1, Pass@1 52.8 (Dec 13, 2025). Updated 4d ago.
Evaluation Results
Method                      Tag                        Date     Pass@1
Vision-SR1                  Model Category=Reasoni...  2025.12  52.8
R1-ShareVL-7B               Model Category=Reasoni...  2025.12  52.28
PeRL-VL                     Base Model=Qwen2.5-VL-...  2025.12  52.22
RL (Conditional Hard Gate)  Training Stage=PeRL: P...  2025.12  52.11
RL (Conditional Soft Gate)  Training Stage=PeRL: P...  2025.12  52.09
VL-Cogito                   Model Category=Reasoni...  2025.12  51.72
PeBR-R1-7B                  Model Category=Reasoni...  2025.12  51.59
RL (Verifiable Rewards)     Training Stage=PeRL: P...  2025.12  51.42
ThinkLite-VL                Model Category=Reasoni...  2025.12  50.81
RL (Aggregated Rewards)     Training Stage=PeRL: P...  2025.12  50.8
SFT (OpenThought)           Training Stage=PeRL: R...  2025.12  48.51
Qwen2.5-VL-7B               Model Category=Base        2025.12  48.15
SFT (GPT-4o)                Training Stage=PeRL: R...  2025.12  48.06