Mathematical Reasoning on MathVista MVistam
[Chart: Accuracy by date (x-axis: Jan 1, 2026; y-axis: Accuracy). Best result: Gemini-2.0-Flash, 73.4. Updated 3d ago.]
Evaluation Results
Method            Configuration              Date     Accuracy
Gemini-2.0-Flash  Sampling Strategy=avg@...  2026.01  73.4
CPPO              Backbone=Qwen2.5-VL-7B...  2026.01  72.2
PAPO              Backbone=Qwen2.5-VL-7B...  2026.01  71.6
GRPO              Backbone=Qwen2.5-VL-7B...  2026.01  71.2
NoisyRollout      Backbone=Qwen2.5-VL-7B...  2026.01  71.1
OpenVLThinker     Backbone=Qwen2.5-VL-7B...  2026.01  70.7
PerceptionR1      Backbone=Qwen2.5-VL-7B...  2026.01  70
Look-Back         Backbone=Qwen2.5-VL-7B...  2026.01  69.1
Vision-Matters    Backbone=Qwen2.5-VL-7B...  2026.01  68.6
Vision-SR1        Backbone=Qwen2.5-VL-7B...  2026.01  67
CPPO              Backbone=Qwen2.5-VL-3B...  2026.01  66.3
Qwen2.5-VL-7B     Backbone=Qwen2.5-VL-7B...  2026.01  65.6
PAPO              Backbone=Qwen2.5-VL-3B...  2026.01  64.8
GRPO              Backbone=Qwen2.5-VL-3B...  2026.01  63.7
Visionary-R1      Backbone=Qwen2.5-VL-3B...  2026.01  61.4
GPT4-o            Sampling Strategy=avg@...  2026.01  60
OpenVLThinker     Backbone=Qwen2.5-VL-3B...  2026.01  60
Qwen2.5-VL-3B     Backbone=Qwen2.5-VL-3B...  2026.01  56.4