Mathematical Reasoning on AIME (Pass@1 Accuracy, Length Exceeding Ratio)

63.7 Pass@1 Accuracy

[Chart: Pass@1 Accuracy by date (Jan 8, 2026)]
Updated 3d ago

Evaluation Results

Method    Date       Pass@1 Accuracy    Length Exceeding Ratio    Links
          2026.01    63.7               71.3
          2026.01    56.9               0.1
          2026.01    55.4               85.6
          2026.01    54.6               2.5
          2026.01    53.1               0.2
          2026.01    50.2               2.1
          2026.01    29.8               91.5
          2026.01    29.4               6.5
          2026.01    23.1               10.8