
external contest-math suite

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Mathematical Reasoning | 120-question external contest-math suite (AIME 2024 I, AIME 2024 II, AIME 2025, AIME 2026, HMMT November 2025), AoPS-derived library | Accuracy Change: 1.88 | 5 |