Mathematical Reasoning on DynaMath DMath
[Chart: Accuracy over time. Leading result: CPPO, 56.9. Date marker: Jan 1, 2026.]
Updated 4d ago
Evaluation Results

Method            Setting                    Links    Accuracy
CPPO              Backbone=Qwen2.5-VL-7B...  2026.01  56.9
NoisyRollout      Backbone=Qwen2.5-VL-7B...  2026.01  55.9
PerceptionR1      Backbone=Qwen2.5-VL-7B...  2026.01  55.8
GRPO              Backbone=Qwen2.5-VL-7B...  2026.01  55.6
PAPO              Backbone=Qwen2.5-VL-7B...  2026.01  54.7
Vision-Matters    Backbone=Qwen2.5-VL-7B...  2026.01  54.5
Qwen2.5-VL-7B     Backbone=Qwen2.5-VL-7B...  2026.01  53.2
Vision-SR1        Backbone=Qwen2.5-VL-7B...  2026.01  52.6
Look-Back         Backbone=Qwen2.5-VL-7B...  2026.01  52.5
CPPO              Backbone=Qwen2.5-VL-3B...  2026.01  48.9
GRPO              Backbone=Qwen2.5-VL-3B...  2026.01  45.7
PAPO              Backbone=Qwen2.5-VL-3B...  2026.01  45.4
OpenVLThinker     Backbone=Qwen2.5-VL-7B...  2026.01  43.9
Gemini-2.0-Flash  Sampling Strategy=avg@...  2026.01  42.1
Visionary-R1      Backbone=Qwen2.5-VL-3B...  2026.01  41.2
OpenVLThinker     Backbone=Qwen2.5-VL-3B...  2026.01  35.6
GPT4-o            Sampling Strategy=avg@...  2026.01  34.5
Qwen2.5-VL-3B     Backbone=Qwen2.5-VL-3B...  2026.01  33.7