PolyMath

Benchmarks

Task Name               Dataset Name      SOTA Result          Trend
Mathematical Reasoning  PolyMath English  Pass@1: 44           9
Multilingual Reasoning  Polymath Low      Accuracy (en): 96.5  3
Mathematical Reasoning  Polymath Low      Accuracy (en): 96.8  3