
Mathematical Reasoning on OlympiadBench (Acc(%), Avg.Steps(n))

Top result: CRAFT, 73.8 Accuracy (%)

[Chart: Accuracy (%), Apr 15, 2026]
Updated 3d ago

Evaluation Results

Date      Acc (%)   Avg. Steps (n)
2026.04   73.8      7.6
2026.04   73.2      6.6
2026.04   70        29
2026.04   68        29.4
2026.04   68        29.7
2026.04   66.5      18.5
2026.04   65.3      29
2026.04   64        6.5
2026.04   63.4      29.6
2026.04   62.4      28.3
2026.04   60.8      42.2
2026.04   58.4      29.2
2026.04   57.6      29.1
2026.04   56        26
2026.04   55.6      10.2
2026.04   51        25.9
2026.04   46.6      7.6
2026.04   46        3
2026.04   40        27.8
2026.04   25.4      10.9