General-purpose multiple-choice evaluation on MMBench EN V11 (dev)
Best Pass@1: 84.3 (Vision-SR1)
[Chart: Pass@1 by date; x-axis date shown: Dec 13, 2025]
Updated 4d ago
Evaluation Results
| Method | Attributes | Date | Pass@1 |
| --- | --- | --- | --- |
| Vision-SR1 | Model Category=Reasoni... | 2025.12 | 84.3 |
| RL (Conditional Hard Gate) | Training Stage=PeRL: P... | 2025.12 | 84.29 |
| RL (Conditional Soft Gate) | Training Stage=PeRL: P... | 2025.12 | 84.15 |
| VL-Cogito | Model Category=Reasoni... | 2025.12 | 83.92 |
| PeBR-R1-7B | Model Category=Reasoni... | 2025.12 | 83.78 |
| R1-ShareVL-7B | Model Category=Reasoni... | 2025.12 | 83.67 |
| PeRL-VL | Base Model=Qwen2.5-VL-... | 2025.12 | 83.5 |
| RL (Verifiable Rewards) | Training Stage=PeRL: P... | 2025.12 | 83.2 |
| RL (Aggregated Rewards) | Training Stage=PeRL: P... | 2025.12 | 83.2 |
| Qwen2.5-VL-7B | Model Category=Base | 2025.12 | 82.9 |
| ThinkLite-VL | Model Category=Reasoni... | 2025.12 | 82.65 |
| SFT (GPT-4o) | Training Stage=PeRL: R... | 2025.12 | 82.57 |
| SFT (OpenThought) | Training Stage=PeRL: R... | 2025.12 | 81.1 |