
Mathematical Reasoning on AIME 2024 (Reasoning Performance)

Reasoning Performance: 69.2

RealSafe-R1-32B

[Chart: Reasoning Performance, y-axis from 45.8 to 64.025; data point dated May 21, 2025]
Updated 1mo ago

Evaluation Results

Date      Reasoning Performance
2025.05   69.2
2025.05   68.3
2025.05   65.8
2025.05   51.7
2025.05   50.8
2025.05   46.7