
Computational Reasoning Suite

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Computational Reasoning | Computational Reasoning Suite (AIME24, AIME25, AMC23, GSM8K, MATH) | AIME24 Score: 20.1 | 10