Math Reasoning on AIME 25 (Avg@32, Pass@16)
[Chart: Avg@32 Score and Pass@16. Best Avg@32 Score: 31.15 (training-time reweighting), Mar 23, 2026]
Evaluation Results
Method                     Model            Date     Avg@32 Score  Pass@16
training-time reweighting  Qwen3-8B-Base    2026.03  31.15         55.38
DAPO                       Qwen3-8B-Base    2026.03  26.67         46.76
training-time reweighting  Qwen2.5-Math-7B  2026.03  18.54         36.72
DAPO                       Qwen2.5-Math-7B  2026.03  17.6          30.45
Base                       Qwen2.5-Math-7B  2026.03  6.67          27.84
Base                       Qwen3-8B-Base    2026.03  5.73          32.8
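
The page does not define its two metrics, so the following is only a sketch of how Avg@k and Pass@k are commonly computed for a single problem. It assumes that Avg@32 is the mean correctness over 32 sampled answers and that Pass@16 uses the standard unbiased pass@k estimator over those same samples. The function names `avg_at_k` and `pass_at_k` are illustrative and not taken from the benchmark.

```python
from math import comb


def avg_at_k(correct_flags):
    """Fraction of sampled answers that are correct for one problem."""
    return sum(correct_flags) / len(correct_flags)


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate: probability that at least one of k
    answers drawn without replacement from n samples (c correct) is correct."""
    if n - c < k:
        # Fewer than k incorrect samples: every draw of k contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical problem: 32 samples, 2 of them correct (made-up data).
flags = [True, True] + [False] * 30
print(avg_at_k(flags))                       # 0.0625
print(round(pass_at_k(32, sum(flags), 16), 4))  # 0.7581
```

Under these assumptions, a benchmark-level score would be the mean of the per-problem values over all problems, scaled to a percentage. This also shows why Pass@16 can be much higher than Avg@32 for the same model: a problem solved in only a few of the 32 samples still contributes substantially to Pass@16.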