AIME and HMMT

Benchmarks

Task Name	Dataset Name	Metric	SOTA Result	
Mathematical Reasoning	AIME and HMMT Average	Pass@1 Absolute	45	26
Math Reasoning	AIME 2024, AIME 2025, and HMMT 2025 Aggregate	Macro Avg. Accuracy	66.9	15
Math Reasoning	AIME and HMMT	AIME 2024 Score	72.1	13
