Math Reasoning on MathVerse
Top result: ADHint, 75.2 Pass@1
[Chart: Pass@1 and Avg@8 Score over time, Dec 15, 2025 to Mar 2, 2026]
Updated 1mo ago
Evaluation Results
| Method | Details | Date | Pass@1 | Avg@8 Score |
|---|---|---|---|---|
| ADHint | Base Model=Qwen3-VL-8B... | 2025.12 | 75.2 | 80.1 |
| HintGRPO | Base Model=Qwen3-VL-8B... | 2025.12 | 64.2 | 68.7 |
| GRPO | Base Model=Qwen3-VL-8B... | 2025.12 | 63.1 | 79 |
| Trajectory-Hint | | 2026.03 | 43.22 | - |
| Auxiliary Reward | | 2026.03 | 42.92 | - |
| Ctrl-R | Power scaling factor (... | 2026.03 | 42.72 | - |
| OpenVLThinker | | 2026.03 | 41.42 | - |
| GRPO | | 2026.03 | 40.61 | - |