Mathematical Reasoning on AIME 2024 (Mean@32 accuracy)
Top result: HTPO, 75.2 Mean@32 accuracy (May 8, 2026). Updated 22d ago.
Evaluation Results
Method       Configuration                Date      Mean@32 Accuracy
HTPO         Backbone=Qwen3-8B-Inst...    2026.05   75.2
SAPO         Backbone=Qwen3-8B-Inst...    2026.05   74
DAPO         Backbone=Qwen3-8B-Inst...    2026.05   73.2
GSPO         Backbone=Qwen3-8B-Inst...    2026.05   72.9
80/20-Rule   Backbone=Qwen3-8B-Inst...    2026.05   72.9
BAPO         Backbone=Qwen3-8B-Inst...    2026.05   72.8
GRPO†        Backbone=Qwen3-8B-Inst...    2026.05   71.3
HTPO         Backbone=Qwen3-8B-Base       2026.05   41.5
SAPO         Backbone=Qwen3-8B-Base       2026.05   40.3
GSPO         Backbone=Qwen3-8B-Base       2026.05   39.2
BAPO         Backbone=Qwen3-8B-Base       2026.05   35.6
80/20-Rule   Backbone=Qwen3-8B-Base       2026.05   34.3
DAPO         Backbone=Qwen3-8B-Base       2026.05   32.9
GRPO†        Backbone=Qwen3-8B-Base       2026.05   31.1
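
The scores listed here are Mean@32 accuracies. As a rough illustration of how such a number is usually obtained, the sketch below assumes the common reading of Mean@k: sample k responses per problem, score each as correct or incorrect, take the fraction correct per problem, and average over problems. The page does not state its exact protocol, so the function name `mean_at_k` and the input layout (one list of booleans per problem) are hypothetical.

```python
from typing import Sequence


def mean_at_k(correct: Sequence[Sequence[bool]], k: int = 32) -> float:
    """Return Mean@k accuracy in percent.

    `correct[i][j]` is True when the j-th sampled response to problem i
    was graded correct. This layout is an assumption for illustration,
    not the leaderboard's documented format.
    """
    if not correct:
        raise ValueError("need at least one problem")

    per_problem = []
    for samples in correct:
        if len(samples) != k:
            raise ValueError(f"expected {k} samples per problem, got {len(samples)}")
        # Fraction of the k samples that were correct for this problem.
        per_problem.append(sum(samples) / k)

    # Average the per-problem fractions and express the result as a percentage.
    return 100.0 * sum(per_problem) / len(per_problem)


# Toy data: 2 problems, 4 samples each (not real benchmark results).
toy = [
    [True, True, False, False],   # 2/4 correct
    [True, False, False, False],  # 1/4 correct
]
toy_score = mean_at_k(toy, k=4)  # (0.5 + 0.25) / 2 = 0.375, i.e. 37.5
```

Under this reading, a score of 75.2 would mean that about three quarters of all sampled responses were graded correct, averaged across problems.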