Mathematical Reasoning on OlympiadBench (Mean@32 accuracy)
[Chart: Mean@32 Accuracy. Top result: HTPO, 60.9 (May 8, 2026). Updated 22d ago.]
Evaluation Results
| Method | Backbone | Date | Mean@32 Accuracy |
| --- | --- | --- | --- |
| HTPO | Qwen3-8B-Inst... | 2026.05 | 60.9 |
| 80/20-Rule | Qwen3-8B-Inst... | 2026.05 | 60 |
| SAPO | Qwen3-8B-Inst... | 2026.05 | 59.9 |
| DAPO | Qwen3-8B-Inst... | 2026.05 | 59.3 |
| GSPO | Qwen3-8B-Inst... | 2026.05 | 58.4 |
| BAPO | Qwen3-8B-Inst... | 2026.05 | 56.9 |
| GRPO† | Qwen3-8B-Inst... | 2026.05 | 53.5 |
| HTPO | Qwen3-8B-Base | 2026.05 | 47.6 |
| DAPO | Qwen3-8B-Base | 2026.05 | 45.7 |
| 80/20-Rule | Qwen3-8B-Base | 2026.05 | 45.1 |
| SAPO | Qwen3-8B-Base | 2026.05 | 44.8 |
| GSPO | Qwen3-8B-Base | 2026.05 | 44.6 |
| BAPO | Qwen3-8B-Base | 2026.05 | 42.3 |
| GRPO† | Qwen3-8B-Base | 2026.05 | 39.8 |
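
This page does not define Mean@32. Under the common reading of the metric (an assumption here, not something stated by the leaderboard), each problem is answered 32 times, the fraction of correct samples is computed per problem, and those fractions are averaged over all problems. A minimal sketch of that reading follows; the function name `mean_at_k` and the toy grading data are hypothetical.

```python
from statistics import mean


def mean_at_k(correct: list[list[bool]], k: int = 32) -> float:
    """Return Mean@k accuracy as a percentage.

    correct[i][j] is True when sample j for problem i was graded correct.
    Assumes every problem has exactly k graded samples.
    """
    if not correct:
        raise ValueError("need at least one problem")
    per_problem = []
    for samples in correct:
        if len(samples) != k:
            raise ValueError(f"expected {k} samples, got {len(samples)}")
        # Fraction of the k samples that were correct for this problem.
        per_problem.append(sum(samples) / k)
    # Average the per-problem fractions and scale to a percentage.
    return 100.0 * mean(per_problem)


# Toy data (hypothetical): one problem solved in 16 of 32 samples,
# one problem solved in all 32 samples.
toy = [[True] * 16 + [False] * 16, [True] * 32]
score = mean_at_k(toy)
```

Averaging over many samples per problem reduces the variance that a single sampled answer would introduce, which is presumably why the scores above are reported to one decimal place.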