Competition Mathematics on AIME 2025 (Accuracy avg@4)
[Chart: Accuracy (avg@4). Top entry: PAPO, 30.7, Mar 27, 2026.]
Updated 20d ago
Evaluation Results
| Method    | Model         | Date    | Accuracy (avg@4) |
|-----------|---------------|---------|------------------|
| PAPO      | Qwen3-4B-Base | 2026.03 | 30.7             |
| ORM(DAPO) | Qwen3-4B-Base | 2026.03 | 26               |
| PAPO      | Qwen2.5-14B   | 2026.03 | 17.8             |
| ORM(GRPO) | Qwen2.5-14B   | 2026.03 | 13.8             |
| PAPO      | Qwen2.5-7B    | 2026.03 | 13.1             |
| ORM(GRPO) | Qwen2.5-7B    | 2026.03 | 10.8             |
| Base      | Qwen2.5-14B   | 2026.03 | 8.3              |
| Base      | Qwen2.5-7B    | 2026.03 | 7.5              |
| Base      | Qwen3-4B-Base | 2026.03 | 5                |
| ORM(GRPO) | Qwen2.5-3B    | 2026.03 | 4.2              |
| PAPO      | Qwen2.5-3B    | 2026.03 | 3.3              |
| Base      | Qwen2.5-3B    | 2026.03 | 0.8              |