
Mathematical and General Reasoning Suite

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Comprehensive Reasoning | Mathematical and General Reasoning Suite Combined | Overall Accuracy: 46.7 | 16