Quantitative Reasoning on AIME

Accuracy: 28.7

[Chart: Accuracy, with a "Ground Truth" legend entry; date label: May 31, 2026]
Updated 1d ago

Evaluation Results

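| Method | Date | Accuracy | (unlabeled) | Links |
|---|---|---|---|---|
| | 2026.05 | 28.7 | - | |
| | 2026.05 | 28.4 | - | |
| | 2026.05 | 27.9 | 91.7 | |
| | 2026.05 | 27.4 | 89.2 | |
| | 2026.05 | 27.4 | - | |
| | 2026.05 | 27.3 | 88.4 | |
| | 2026.05 | 27.2 | 84.7 | |
| | 2026.05 | 27.1 | 83.7 | |
| | 2026.05 | 26.8 | 83.2 | |
| | 2026.05 | 26.7 | 79.6 | |
| | 2026.05 | 26.6 | 91 | |
| | 2026.05 | 26.5 | 80 | |
| | 2026.05 | 26 | 83.5 | |
| | 2026.05 | 25.9 | 82.4 | |
| | 2026.05 | 25.6 | 78.8 | |
| | 2026.05 | 22.9 | - | |
| | 2026.05 | 22 | 92.5 | |
| | 2026.05 | 20.7 | 81.7 | |
| | 2026.05 | 19.6 | 72.5 | |
| | 2026.05 | 18.9 | - | |
| | 2026.05 | 18.9 | - | |
| | 2026.05 | 18.9 | - | |
| | 2026.05 | 18.88 | - | |
| | 2026.05 | 18.7 | 65 | |
| | 2026.05 | 10.91 | - | |
| | 2026.05 | 10.9 | - | |
| | 2026.05 | 4.27 | - | |
| | 2026.05 | 4.27 | - | |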