Mathematical Reasoning on AIME 25 (Pass@1, Pass@16)
[Chart: Pass@1 by date. Top result: APMPO, 26.7 Pass@1 (Apr 11, 2026).]
Updated 27d ago
Evaluation Results
Method  Details                     Date     Pass@1  Pass@16
APMPO   Backbone=DeepSeek-R1-D...   2026.04  26.7    50
DAPO    Backbone=DeepSeek-R1-D...   2026.04  23.3    50
GMPO    Backbone=DeepSeek-R1-D...   2026.04  23.3    46.7
GMPO    Backbone=Qwen2.5-Math-...   2026.04  20      26.7
GRPO    Backbone=DeepSeek-R1-D...   2026.04  20      43.3
DAPO    Backbone=Qwen2.5-Math-...   2026.04  16.7    23.3
APMPO   Backbone=Qwen2.5-Math-...   2026.04  16.7    26.7
GRPO    Backbone=Qwen2.5-Math-...   2026.04  13.3    16.7
Base    Backbone=DeepSeek-R1-D...   2026.04  13.3    40
DAPO    Backbone=Qwen2.5-3B-In...   2026.04  10      20
GMPO    Backbone=Qwen2.5-3B-In...   2026.04  10      16.7
APMPO   Backbone=Qwen2.5-3B-In...   2026.04  10      20
GRPO    Backbone=Qwen2.5-3B-In...   2026.04  6.7     13.3
Base    Backbone=Qwen2.5-Math-...   2026.04  3.3     10
Base    Backbone=Qwen2.5-3B-In...   2026.04  0       6.7