
Grounded Chess Reasoning on Chess Puzzles (test)

Accuracy (Beginner): 95

gpt-5

Mar 20, 2026
Updated 2mo ago

Evaluation Results

Method | Links
2026.03
9584543185.276.712,193
2026.03
8886704478.275.43,182
2026.03
655934193840.86,418
2026.03
6539392253.648.1178
2026.03
5736272746.642.2189
2026.03
5239271841.838.3925
2026.03
5130302646.240.9188
2026.03
373129193130.19,668
2026.03
3519161026.823.88,028
2026.03
3324141125.623.38,111
2026.03
3229151128.625.63,227
2026.03
2721616222011,249
2026.03
241414817.616.413,938
2026.03
221531613.813.93,393
2026.03
1285108.68.71,092
2026.03
111014161614.614,442
2026.03
94658.27.29,991
2026.03
967487.32,818
2026.03
00010.40.3806
2026.03
000000705