Mathematical Reasoning on MATH500 (% Avg@4)
Chart: Avg@4 (%) by date. Best result: GRPO + RePro, 94.1 (Dec 1, 2025). Updated 4d ago.
Evaluation Results
Method          Backbone          Date     Avg@4 (%)
--------------  ----------------  -------  ---------
GRPO + RePro    Qwen3-1.7B        2025.12  94.1
GRPO            Qwen3-1.7B        2025.12  93.4
RF++ B + RePro  Qwen3-1.7B        2025.12  92.7
PPO + RePro     Qwen3-1.7B        2025.12  92.4
PPO             Qwen3-1.7B        2025.12  92.1
Original        Qwen3-1.7B        2025.12  91.5
RF++ B          Qwen3-1.7B        2025.12  91.5
RF++ B + RePro  Hunyuan-1.8B-...  2025.12  86.2
PPO             Hunyuan-1.8B-...  2025.12  86
GRPO            Hunyuan-1.8B-...  2025.12  85.6
RF++ B          Hunyuan-1.8B-...  2025.12  85.5
PPO + RePro     Hunyuan-1.8B-...  2025.12  84.3
GRPO + RePro    Hunyuan-1.8B-...  2025.12  84.2
PPO + RePro     MobileLLM-R1-...  2025.12  83.5
PPO             MobileLLM-R1-...  2025.12  81.4
Original        Hunyuan-1.8B-...  2025.12  81.3
Original        MobileLLM-R1-...  2025.12  76.4
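
The effect of adding RePro is easier to read as a per-pair difference than as a ranked list. The following is a minimal sketch, not part of the leaderboard itself: it copies the scores listed above for the pairs that have both a base method and a "+ RePro" variant, and subtracts one from the other. The function name `repro_deltas` and the dictionary layout are illustrative assumptions, and the backbone names are kept truncated exactly as the page shows them.

```python
# Scores copied from the results above: (method, backbone) -> Avg@4 (%).
# Backbone names stay truncated ("...") exactly as they appear on the page.
scores = {
    ("GRPO + RePro", "Qwen3-1.7B"): 94.1,
    ("GRPO", "Qwen3-1.7B"): 93.4,
    ("RF++ B + RePro", "Qwen3-1.7B"): 92.7,
    ("RF++ B", "Qwen3-1.7B"): 91.5,
    ("PPO + RePro", "Qwen3-1.7B"): 92.4,
    ("PPO", "Qwen3-1.7B"): 92.1,
    ("RF++ B + RePro", "Hunyuan-1.8B-..."): 86.2,
    ("RF++ B", "Hunyuan-1.8B-..."): 85.5,
    ("PPO + RePro", "Hunyuan-1.8B-..."): 84.3,
    ("PPO", "Hunyuan-1.8B-..."): 86.0,
    ("GRPO + RePro", "Hunyuan-1.8B-..."): 84.2,
    ("GRPO", "Hunyuan-1.8B-..."): 85.6,
    ("PPO + RePro", "MobileLLM-R1-..."): 83.5,
    ("PPO", "MobileLLM-R1-..."): 81.4,
}

SUFFIX = " + RePro"  # how the page labels the RePro variants


def repro_deltas(scores):
    """Return {(base_method, backbone): score_with_RePro - score_without}."""
    deltas = {}
    for (method, backbone), value in scores.items():
        if method.endswith(SUFFIX):
            base = method[: -len(SUFFIX)]
            if (base, backbone) in scores:
                # Round to one decimal, matching the precision of the listed scores.
                deltas[(base, backbone)] = round(value - scores[(base, backbone)], 1)
    return deltas


# Print the pairs from largest gain to largest drop, in percentage points.
for (base, backbone), delta in sorted(
    repro_deltas(scores).items(), key=lambda kv: kv[1], reverse=True
):
    print(f"{base:7s} {backbone:18s} {delta:+.1f}")
```

The differences are in percentage points of Avg@4 and say nothing about statistical significance, since the page reports no variance across runs.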