Mathematical Reasoning on AIME 25 (Avg@32, #Token)
[Chart: Avg@32 Score by date; GR³, 32.8, Mar 11, 2026]
Evaluation Results
| Method | Category | Date | Avg@32 Score | Token Count |
| --- | --- | --- | --- | --- |
| GR³ | Performance-o... | 2026.03 | 32.8 | 8,137 |
| GRPO | Performance-o... | 2026.03 | 31.4 | 12,985 |
| DLER-R1-1.5B | Length-orient... | 2026.03 | 27.8 | 3,153 |
| AdaptThink-1.5B | Length-orient... | 2026.03 | 24.7 | 9,234 |
| Laser-DE-L4096-1.5B | Length-orient... | 2026.03 | 24.1 | 5,008 |
| DeepSeek-R1-Distill-1.5B | Initial model | 2026.03 | 23.6 | 15,799 |
| LCR1-1.5B | Length-orient... | 2026.03 | 20.6 | 8,275 |
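
The two columns are easier to read side by side when each method is compared against the entry marked "Initial model". The following is a minimal sketch of that arithmetic, using only the scores and token counts listed above. The names `Row`, `ROWS`, `BASELINE`, and `compare_to_baseline` are hypothetical and introduced for illustration, and treating DeepSeek-R1-Distill-1.5B as the baseline is an assumption based on its category label.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One leaderboard entry (hypothetical container, not part of the page)."""
    method: str
    score: float  # Avg@32 Score
    tokens: int   # Token Count


# Values copied from the Evaluation Results table.
ROWS = [
    Row("GR³", 32.8, 8137),
    Row("GRPO", 31.4, 12985),
    Row("DLER-R1-1.5B", 27.8, 3153),
    Row("AdaptThink-1.5B", 24.7, 9234),
    Row("Laser-DE-L4096-1.5B", 24.1, 5008),
    Row("DeepSeek-R1-Distill-1.5B", 23.6, 15799),
    Row("LCR1-1.5B", 20.6, 8275),
]

# Assumption: the "Initial model" row is the reference point.
BASELINE = "DeepSeek-R1-Distill-1.5B"


def compare_to_baseline(rows, baseline_name):
    """Return {method: (score_delta, token_ratio)} relative to the baseline.

    score_delta is in Avg@32 points; token_ratio is the method's token
    count divided by the baseline's token count.
    """
    base = next(r for r in rows if r.method == baseline_name)
    result = {}
    for r in rows:
        if r.method == baseline_name:
            continue
        score_delta = round(r.score - base.score, 1)
        token_ratio = round(r.tokens / base.tokens, 3)
        result[r.method] = (score_delta, token_ratio)
    return result


if __name__ == "__main__":
    for method, (delta, ratio) in compare_to_baseline(ROWS, BASELINE).items():
        print(f"{method}: {delta:+.1f} points, {ratio:.1%} of baseline tokens")
```

Under these assumptions, GR³ comes out 9.2 points above the initial model while using roughly half of its tokens, and LCR1-1.5B is the only listed method that scores below the initial model.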