Math Reasoning on OlympiadBench (Pass@16)
[Chart: Pass@16 over time, Sep 27, 2025 to Apr 1, 2026. Highlighted result: GEB-1/π, 65.78.]
Updated 16d ago
Evaluation Results
| Method | Backbone | Date | Pass@16 |
|---|---|---|---|
| GEB-1/π | Qwen2.5-7B | 2025.09 | 65.78 |
| GEB-π | Qwen2.5-7B | 2025.09 | 64.59 |
| GEB-arctanhπ | Qwen2.5-7B | 2025.09 | 64.59 |
| DPO | Qwen2.5-7B | 2025.09 | 57.78 |
| HiLL | Qwen2.5-7B-In... | 2026.04 | 48.1 |
| SAGE | Qwen2.5-7B-In... | 2026.04 | 45.9 |
| HiLL w/o TW | Qwen2.5-7B-In... | 2026.04 | 45.7 |
| GRPO | Qwen2.5-7B-In... | 2026.04 | 44.5 |
| LUFFY | Qwen2.5-7B-In... | 2026.04 | 44.2 |
| Scaf-GRPO | Qwen2.5-7B-In... | 2026.04 | 42 |
| Base | Qwen2.5-7B-In... | 2026.04 | 39.2 |
| HiLL | Llama-3.2-3B-... | 2026.04 | 23.2 |
| HiLL w/o TW | Llama-3.2-3B-... | 2026.04 | 22.4 |
| SAGE | Llama-3.2-3B-... | 2026.04 | 22 |
| GRPO | Llama-3.2-3B-... | 2026.04 | 21.8 |
| Scaf-GRPO | Llama-3.2-3B-... | 2026.04 | 19.5 |
| Base | Llama-3.2-3B-... | 2026.04 | 14.2 |
| LUFFY | Llama-3.2-3B-... | 2026.04 | 11.9 |
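
The page does not say how the Pass@16 scores are computed, and individual entries may use different protocols. As a rough illustration only, the sketch below assumes the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of samples drawn per problem and c is how many of them are correct, averaged over problems and reported as a percentage. The function names `pass_at_k` and `benchmark_pass_at_k` are hypothetical and not taken from any of the listed methods.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of them correct.

    Assumption: the standard unbiased estimator 1 - C(n-c, k) / C(n, k).
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples are wrong, every size-k subset has a correct one.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, as a percentage.

    `results` holds one (n, c) pair per problem (hypothetical input format).
    """
    if not results:
        raise ValueError("results must not be empty")
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With exactly 16 samples per problem, this reduces to the fraction of problems for which at least one of the 16 samples is correct.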