
GSM-Plus

Benchmarks

Task Name                  Dataset Name     SOTA Result            Trend
Mathematical Reasoning     GSM-Plus         Accuracy: 73.8         90
Mathematical Reasoning     GSM-Plus (test)  Accuracy: 68.8         50
Mathematical Reasoning     GSM+             Accuracy (GSM+): 50.2  13
Mathematical Reasoning     GSM-Plus (mini)  Accuracy: 52.8         8
Math Word Problem Solving  GSM+ v1 (test)   Accuracy: 65.7         6