BigMath

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Math Tutoring | BigMath In-Domain | Rsol 57.4 | 21 |
| Math Reasoning | BigMath level 5 | Average Accuracy 43.1 | 18 |
| Math Reasoning | BigMath level 4 | Accuracy 66.3 | 18 |
| Mathematical Reasoning | BigMath (test) | True Accuracy 48.4 | 6 |
| Mathematical Reasoning | Bigmath Mathematics | Score 47.33 | 3 |