Reasoning on MATH500 (Average Score)
Best result: 96.4 MATH500 Average Score, Qwen3-8B (thinking)
[Chart: MATH500 Average Score over time; data point at Dec 1, 2025]
Updated 4d ago
Evaluation Results
| Method | Settings (truncated in source) | Date | MATH500 Average Score |
|---|---|---|---|
| Qwen3-8B (thinking) | Base Model=Qwen3-8B, T... | 2025.12 | 96.4 |
| REINFORCE++ (Ours) | Base Model=Qwen3-8B, T... | 2025.12 | 96.4 |
| Qwen3-8B + CPO | Base Model=Qwen3-8B, T... | 2025.12 | 95.55 |
| Qwen3-8B + SFT (STAR-1) | Base Model=Qwen3-8B, T... | 2025.12 | 95.5 |
| Qwen3-8B + SFT (SafeChain) | Base Model=Qwen3-8B, T... | 2025.12 | 95 |
| REINFORCE++ (Ours) | Base Model=DeepSeek-R1... | 2025.12 | 92.3 |
| DeepSeek-R1-Distill-Qwen-7B | Base Model=DeepSeek-R1... | 2025.12 | 92 |
| DeepSeek-R1-Distill-Qwen-7B + SFT (STAR-1) | Base Model=DeepSeek-R1... | 2025.12 | 91.8 |
| Qwen3-8B + SFT (R2D-R1) | Base Model=Qwen3-8B, T... | 2025.12 | 91.7 |
| DeepSeek-R1-Distill-Qwen-7B + SFT (SafeChain) | Base Model=DeepSeek-R1... | 2025.12 | 91.05 |
| DeepSeek-R1-Distill-Qwen-7B + CPO | Base Model=DeepSeek-R1... | 2025.12 | 90.75 |
| DeepSeek-R1-Distill-Qwen-7B + SFT (R2D-R1) | Base Model=DeepSeek-R1... | 2025.12 | 86.8 |