Mathematical Reasoning on AMC 23 (Pass@1, Pass@16)
[Chart: Pass@1 by date. Plotted point: APMPO, 65 (Apr 11, 2026). Selectable metrics: Pass@1, Pass@16.]
Updated 27d ago
Evaluation Results
Method   Backbone            Date      Pass@1   Pass@16
APMPO    DeepSeek-R1-D...    2026.04   65       92.5
APMPO    Qwen2.5-Math-...    2026.04   62.5     85
GMPO     DeepSeek-R1-D...    2026.04   62.5     87.5
DAPO     DeepSeek-R1-D...    2026.04   60       90
DAPO     Qwen2.5-Math-...    2026.04   57.5     80
GRPO     DeepSeek-R1-D...    2026.04   57.5     82.5
GMPO     Qwen2.5-Math-...    2026.04   55       82.5
GRPO     Qwen2.5-Math-...    2026.04   52.5     75
Base     Qwen2.5-Math-...    2026.04   47.5     70
Base     DeepSeek-R1-D...    2026.04   47.5     70
DAPO     Qwen2.5-3B-In...    2026.04   45       65
APMPO    Qwen2.5-3B-In...    2026.04   45       70
GMPO     Qwen2.5-3B-In...    2026.04   42.5     60
GRPO     Qwen2.5-3B-In...    2026.04   40       60
Base     Qwen2.5-3B-In...    2026.04   35       55
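
This page does not say how its Pass@1 and Pass@16 scores were computed. As a rough illustration only, here is a minimal sketch of the commonly used unbiased pass@k estimator, assuming each problem is sampled n times and c of those samples are correct. The function names `pass_at_k` and `benchmark_pass_at_k` are hypothetical and not taken from any listed method.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of them correct.

    Assumption: this is the standard combinatorial estimator, not
    necessarily the procedure behind the scores on this page.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the chance that a random size-k
    # subset of the n samples contains no correct sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(correct_counts: list[int], n: int, k: int) -> float:
    """Average per-problem pass@k over a benchmark, as a percentage."""
    scores = [pass_at_k(n, c, k) for c in correct_counts]
    return 100.0 * sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical correct-sample counts for four problems, 16 samples each.
    counts = [0, 16, 8, 4]
    print(benchmark_pass_at_k(counts, n=16, k=1))
    print(benchmark_pass_at_k(counts, n=16, k=16))
```

With k equal to n, a problem counts as solved whenever at least one of its samples is correct, which is why Pass@16 is never lower than Pass@1 for the same set of samples.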