
Zero-shot reasoning on ZeroShot 7

Accuracy: 71.57

FP

[Chart: accuracy over time, Feb 3, 2026 to May 21, 2026]
Updated 9d ago

Evaluation Results

Date      Accuracy
2026.05   71.57
2026.05   70.97
2026.02   67
2026.02   66
2026.05   65.69
2026.05   65.58
2026.05   65.49
2026.05   65.13
2026.02   65
2026.05   64.34
2026.05   64.24
2026.02   64
2026.02   63
2026.05   62.32
2026.05   61.96
2026.05   61.28
2026.02   61
2026.02   61
2026.05   60.73
2026.05   60.53
2026.05   60.38
2026.02   60
2026.05   59.98
2026.05   59.81
2026.05   59.6
2026.02   59
2026.05   58.38
2026.02   58
2026.02   57
2026.02   56
2026.05   55.72
2026.05   54.87
2026.05   54.57
2026.02   54
2026.05   53.8
2026.05   52.52
2026.05   52.45
2026.05   52.41
2026.02   52
2026.05   52
2026.05   51.78
2026.05   51.39
2026.05   50.55
2026.05   50.06
2026.02   50
2026.05   49.42
2026.02   49
2026.05   47.58
2026.05   47.45
2026.05   47.27
2026.05   46.91
2026.05   45.07
2026.05   44.12
2026.05   42.08
2026.05   41.06
2026.05   35.96