Mathematical Reasoning on AMC 23 (Avg@16, #Tokens)
[Chart: Average Accuracy @16 and # Tokens by date. Top result: GR3, 93 Average Accuracy @16 (Mar 11, 2026).]
Evaluation Results
| Method | Model size | Date | Average Accuracy @16 | # Tokens |
|---|---|---|---|---|
| GR3 | 7B | 2026.03 | 93 | 3,090 |
| DLER–R1–7B | 7B | 2026.03 | 91.4 | 2,255 |
| GRPO | 7B | 2026.03 | 90.3 | 7,256 |
| DeepSeek–R1–Distill–7B | 7B | 2026.03 | 89.8 | 6,385 |
| Laser–DE–L4096–7B | 7B | 2026.03 | 88.1 | 2,427 |
| AdaptThink–7B | 7B | 2026.03 | 88.1 | 4,280 |
| LCR1–7B | 7B | 2026.03 | 85.8 | 2,963 |
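
The results report both accuracy and token count, so one way to read them is as a trade-off. The sketch below is illustrative only and is not part of the benchmark: it assumes that a higher Average Accuracy @16 and a lower # Tokens are both preferable, and the helper name pareto_front is hypothetical. It lists the methods that no other method beats on both axes at once.

```python
# Rows copied from the results above: (method, average accuracy @16, # tokens).
RESULTS = [
    ("GR3", 93.0, 3090),
    ("DLER–R1–7B", 91.4, 2255),
    ("GRPO", 90.3, 7256),
    ("DeepSeek–R1–Distill–7B", 89.8, 6385),
    ("Laser–DE–L4096–7B", 88.1, 2427),
    ("AdaptThink–7B", 88.1, 4280),
    ("LCR1–7B", 85.8, 2963),
]


def pareto_front(rows):
    """Return the rows that are not dominated by any other row.

    Assumption (not stated by the source): higher accuracy is better and
    fewer tokens is better. Row B dominates row A when B is at least as
    good on both axes and strictly better on at least one.
    """
    front = []
    for name, acc, tokens in rows:
        dominated = any(
            (other_acc >= acc and other_tokens <= tokens)
            and (other_acc > acc or other_tokens < tokens)
            for _, other_acc, other_tokens in rows
        )
        if not dominated:
            front.append((name, acc, tokens))
    return front


if __name__ == "__main__":
    for name, acc, tokens in pareto_front(RESULTS):
        print(f"{name}: accuracy={acc}, tokens={tokens}")
```

Under these assumptions only GR3 and DLER–R1–7B remain: GR3 has the highest accuracy, DLER–R1–7B uses the fewest tokens, and every other listed method is matched or beaten on both axes by one of the two.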