
Mathematical Reasoning on AMO-Bench (accuracy)

SU-01: 59.8 AMO-Bench Accuracy

[Chart: AMO-Bench accuracy; y-axis ticks 38.48, 44.015, 49.55, 55.085; x-axis date May 13, 2026]
Updated 20d ago

Evaluation Results

Date       Accuracy
2026.05    59.8
2026.05    58.8
2026.05    53.8
2026.05    41.3
2026.05    40.8
2026.05    39.3