Mathematical Reasoning on AIME 2024 (Pass@1 and Token Length)
[Chart: Pass@1 Accuracy and Average Output Token Length by date. Highlighted point: DeepSeek-R1, Pass@1 Accuracy 79.8, Mar 6, 2025.]
Updated 1mo ago
Evaluation Results
Method                         Details                    Date     Pass@1 Accuracy  Average Output Token Length
DeepSeek-R1                    -                          2025.03  79.8             9.6
TinyR1-32B-Preview             Parameters=32B             2025.03  78.1             11.8
DeepSeek-R1-Distill-Qwen-32B   Parameters=32B, Backbo...  2025.03  72.6             9.6
DeepSeek-R1-Distill-Llama-70B  Parameters=70B, Backbo...  2025.03  70               -