Visual Math Reasoning on MathVision (Pass@1)
[Chart: Pass@1 on MathVision. Best result: 56.1 (SFT + AXPO), May 27, 2026.]
Evaluation Results
Method                          Base Model        Date      Pass@1
SFT + AXPO                      Qwen3-VL-8B...    2026.05   56.1
Qwen3-VL-8B-Thinking (Agent)    Qwen3-VL-8B...    2026.05   47.1
DeepEyes-v2                     Qwen2.5-VL-7B     2026.05   28.9
PyVision-RL                     Qwen2.5-VL-7B     2026.05   28.7
Thyme                           Qwen2.5-VL-7B     2026.05   27.6
DeepEyes                        Qwen2.5-VL-7B     2026.05   26.6
Qwen2.5-VL-7B-Instruct          Qwen2.5-VL-7B     2026.05   25.6
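
To make the spread in the table above easier to read, the following minimal sketch ranks the listed methods and reports each one's Pass@1 gap, in points, to Qwen2.5-VL-7B-Instruct. The scores are copied from the table. The `scores` dictionary, the `gaps_to_reference` helper, and the choice of reference row are illustrative assumptions, not part of the leaderboard. The methods are built on different base models, so the gaps are not a like-for-like comparison.

```python
# Pass@1 scores copied from the Evaluation Results table.
scores = {
    "SFT + AXPO": 56.1,
    "Qwen3-VL-8B-Thinking (Agent)": 47.1,
    "DeepEyes-v2": 28.9,
    "PyVision-RL": 28.7,
    "Thyme": 27.6,
    "DeepEyes": 26.6,
    "Qwen2.5-VL-7B-Instruct": 25.6,
}


def gaps_to_reference(scores, reference):
    """Return each method's Pass@1 difference from the reference, in points.

    Hypothetical helper for illustration; rounding to one decimal matches
    the precision shown in the table.
    """
    ref = scores[reference]
    return {name: round(value - ref, 1) for name, value in scores.items()}


# Sort from highest to lowest Pass@1.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Reference row chosen for illustration only (an assumption).
gaps = gaps_to_reference(scores, "Qwen2.5-VL-7B-Instruct")

for name, value in ranked:
    print(f"{name:<30} {value:5.1f}  ({gaps[name]:+.1f})")
```

Any other row can serve as the reference by changing the second argument to `gaps_to_reference`.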