DeepMind-Mathematics

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Mathematical Reasoning | DeepMind-Mathematics | Accuracy 88.4 | 63 |
| Mathematical Reasoning | DeepMind-Mathematics (test) | Accuracy 64.1 | 27 |
| Mathematical Reasoning | DeepMind-Mathematics | Pass@1 87.1 | 22 |
| Open-form mathematical reasoning | DeepMind MATHEMATICS | Exact-match Accuracy 42.13 | 7 |