
Math Reasoning Suite

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Mathematical Reasoning | Math Reasoning Suite Average | Average Accuracy: 63 | 35
Mathematical Reasoning | Math Reasoning Suite (Math500, AIME24, AMC, Gaokao, Minerva Math, Olympiad Bench) | Math500 Score: 93.2 | 12
Mathematical Reasoning | Math Reasoning Suite AIME, AMC, MATH, Minerva, Olympiad (test) | Avg@8 (AIME): 21.4 | 8
Mathematical Reasoning | Math Reasoning Suite (MATH, AIME25, AMC, MINERVA, KAOYAN, OLYMPIAD, CN_MATH24) | MATH Accuracy: 92.2 | 6