
Mathematical Problem Solving on AIME VeRA-H 2024-II

0.909 Avg@5 (%)

GPT-5.1-high

[Chart: Avg@5 by date; axis label Jan 23, 2026]
Updated 4d ago

Evaluation Results

Method  Links
0.909   -3.4
0.84    -7.4
2026.01
0.823   -6.3
0.82    -10.9
2026.01
0.811   -13.1
0.811   -8.9
0.766   -10.6
0.766   -16.3
0.746   -1.1
0.72    -18
0.703   -18.3
2026.01
0.643   -18.6
0.591   -20.9
0.406   -30.9
0.343   -18.6
0.151   -70.6