Mathematical Reasoning on AIME 2024 (Reasoning Performance)
[Chart: Reasoning Performance. Highlighted point: RealSafe-R1-32B, 69.2, May 21, 2025.]
Evaluation Results
| Method | Details | Date | Reasoning Performance |
|---|---|---|---|
| RealSafe-R1-32B | Model Series=32B Model... | 2025.05 | 69.2 |
| Improved CoT | Model Series=32B Model... | 2025.05 | 68.3 |
| STAR1-R1-Distill-32B | Model Series=32B Model... | 2025.05 | 65.8 |
| Improved CoT | Model Series=7B Models... | 2025.05 | 51.7 |
| STAR1-R1-Distill-7B | Model Series=7B Models... | 2025.05 | 50.8 |
| RealSafe-R1-7B | Model Series=7B Models... | 2025.05 | 46.7 |