PYMATH

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Tool-augmented Reasoning | PYMATH (test) | Final Accuracy: 71.9 | 14