Mathematical Reasoning on HMMT 25 (pass@1, pass@64)
Figure: Pass@1 on HMMT 25 over time (x-axis: May 8, 2026 to May 23, 2026; series toggles: Pass@1, Pass@64). Best Pass@1: CurveRL, 9.
Updated 8d ago
Evaluation Results
Method     Configuration             Date     Pass@1  Pass@64
---------  ------------------------  -------  ------  -------
CurveRL    Backbone=Qwen3-4B-Base    2026.05  9       32.4
MaxRL      Backbone=Qwen3-4B-Base    2026.05  7.9     27.8
GRPO       Backbone=Qwen3-4B-Base    2026.05  5.8     19
Zero-Shot  n=N/A                     2026.05  4.16    28.06
DPólya     n=8                       2026.05  3.68    26.62
DPólya     n=32                      2026.05  3.61    25.55
DPólya     n=16                      2026.05  3.6     26.52
DSTaR      n=N/A                     2026.05  2       28.84
CurveRL    Backbone=Qwen3-1.7B-Base  2026.05  1.7     19.3
DPólya     n=4                       2026.05  1.39    26.02
MaxRL      Backbone=Qwen3-1.7B-Base  2026.05  1.3     18.4
Base       Backbone=Qwen3-4B-Base    2026.05  1       22.5
GRPO       Backbone=Qwen3-1.7B-Base  2026.05  0.5     10
Base       Backbone=Qwen3-1.7B-Base  2026.05  0.2     10.6