Mathematical Reasoning on AIME '25 (@1, @32)
[Chart: Accuracy @1 and Accuracy @32 by date. Top Accuracy @1: 32.3 (AVERAGE), Apr 1, 2026.]
Updated 16d ago
Evaluation Results
| Method | Configuration | Date | Accuracy @1 | Accuracy @32 |
|---|---|---|---|---|
| AVERAGE | Base Model=OLMo-3-7B,... | 2026.04 | 32.3 | 66.7 |
| TSV | Base Model=OLMo-3-7B,... | 2026.04 | 32 | 63.3 |
| ISO-C | Base Model=OLMo-3-7B,... | 2026.04 | 31.5 | 53.3 |
| TA | Base Model=OLMo-3-7B,... | 2026.04 | 31.1 | 66.7 |
| EXPERT | Base Model=OLMo-3-7B,... | 2026.04 | 30.7 | 66.7 |
| ACTMat | Base Model=OLMo-3-7B,... | 2026.04 | 29.8 | 66.7 |
| REGMEAN | Base Model=OLMo-3-7B,... | 2026.04 | 27.8 | 60 |
| ZERO-SHOT | Base Model=OLMo-3-7B,... | 2026.04 | 21.6 | 53.3 |