Mathematical Reasoning on AIME '24 (Pass@1, Pass@32)
[Chart: Pass@1 Accuracy and Pass@32 Accuracy. Top entry: ACTMat, 39.9 Pass@1 Accuracy (Apr 1, 2026). Updated 16d ago.]
Evaluation Results
Method     Details                   Date     Pass@1 Accuracy  Pass@32 Accuracy
ACTMat     Base Model=OLMo-3-7B,...  2026.04  39.9             80
TSV        Base Model=OLMo-3-7B,...  2026.04  39.8             80
EXPERT     Base Model=OLMo-3-7B,...  2026.04  38.2             76.7
TA         Base Model=OLMo-3-7B,...  2026.04  36.8             73.3
AVERAGE    Base Model=OLMo-3-7B,...  2026.04  35.9             80
ISO-C      Base Model=OLMo-3-7B,...  2026.04  33.4             76.7
REGMEAN    Base Model=OLMo-3-7B,...  2026.04  30.8             73.3
ZERO-SHOT  Base Model=OLMo-3-7B,...  2026.04  22.1             66.7
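
The page does not say how Pass@1 and Pass@32 were computed. As a rough sketch only, the snippet below uses the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), for a problem with n sampled answers of which c are correct, averaged over problems and reported as a percentage. The function names and the per-problem counts are hypothetical and are not taken from this leaderboard. AIME '24 has 30 problems, and the Pass@32 values listed (80, 76.7, 73.3, 66.7) match 24, 23, 22 and 20 problems out of 30.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of sampled answers, c: number of correct ones, k: budget.
    Assumption: this is the standard estimator, not necessarily the one
    used to produce the leaderboard numbers.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: every size-k subset has a hit.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(correct_counts: list[int], n: int, k: int) -> float:
    """Average pass@k over problems, as a percentage."""
    total = sum(pass_at_k(n, c, k) for c in correct_counts)
    return 100.0 * total / len(correct_counts)


# Hypothetical per-problem correct counts for 30 problems, 32 samples each:
# 24 problems solved at least once, 6 never solved.
example_counts = [8] * 24 + [0] * 6
example_pass_32 = benchmark_pass_at_k(example_counts, n=32, k=32)
example_pass_1 = benchmark_pass_at_k(example_counts, n=32, k=1)
```

With k = 1 the estimator reduces to the fraction of correct samples per problem, which is why a Pass@1 score need not be a multiple of 1/30 even though a Pass@32 score with 32 samples is.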