Visual Math Reasoning on DynaMath (Pass@1)
[Chart: Pass@1 (SFT + AXPO: 79, May 27, 2026)]
Updated 6d ago
Evaluation Results
Method                        Base Model      Date     Pass@1
SFT + AXPO                    Qwen3-VL-8B...  2026.05  79
Qwen3-VL-8B-Thinking (Agent)  Qwen3-VL-8B...  2026.05  75.9
PyVision-RL                   Qwen2.5-VL-7B   2026.05  61.6
DeepEyes-v2                   Qwen2.5-VL-7B   2026.05  57.2
DeepEyes                      Qwen2.5-VL-7B   2026.05  55
Qwen2.5-VL-7B-Instruct        Qwen2.5-VL-7B   2026.05  53.3