Mathematical Reasoning on AIME 25 (AUCOAA)

80 AUCOAA

Adaptive-Answer

Updated 4mo ago

Evaluation Results

Method	Links	AUCOAA
Adaptive-Answer 2026.01		80
Format-Adaptive-Answer 2026.01		80
TWYN 2026.01		79.6
Hard-Length 8k 2026.01		77.2
SFT 2026.01		76.9
Hard-Length 16k 2026.01		76.3
Base model 2026.01		75.8
Soft-Length 2026.01		72.9
Hard-Length 8k → 4k 2026.01		69.8
Normalized-Length 2026.01		61
No-Thinking 2026.01		16