
AIME and HMMT

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Math Reasoning | AIME 2024, AIME 2025, and HMMT 2025 Aggregate | Macro Avg. Accuracy: 66.9 | 15 |
| Math Reasoning | AIME and HMMT | AIME 2024 Score: 72.1 | 13 |