Math Reasoning on AIME 2024 (Pass@1, AES)
[Chart: Pass@1 on AIME 2024; highlighted point: SAS, 39.79 Pass@1; x-axis date: Apr 27, 2026]
Updated 1mo ago
Evaluation Results
Method         #Tok   Date      Pass@1   AES
SAS            4876   2026.04   39.79    -
GRPO-4K        5282   2026.04   38.75    -
ThinkPrune-4k  5468   2026.04   36.04    -
LAPO-I         5627   2026.04   34.58    -
DeepScaleR     6755   2026.04   33.75    -
L1-Max         2158   2026.04   25.63    -