
Math Domain

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Math problem solving | Math Domain (AIME24, Math-OAI, Minerva, Olympiad, ACM23) Qwen2.5-7B (10% selection) | AIME24 Score: 7.71 | 18
Mathematical Problem Solving | Math Domain (Out-of-Domain: MATH500, AIME24, Minerva-Math, AMC23) | MATH500 Score: 91.8 | 11
Mathematical Reasoning | Math Domain In-Domain | MATH500: 91 | 11
Mathematical Reasoning | Math Domain | Avg Accuracy: 66.45 | 7