
Logical Reasoning on ZebraLogic v1.0 (test)

97.7 Cell Accuracy

[Chart: Cell Accuracy by date; axis date label Oct 1, 2025]
Updated 5d ago

Evaluation Results

Date      Cell Accuracy
2025.10   97.7
2025.10   97.2
2025.10   96.8
2025.10   96.3
2025.10   95.9
2025.10   95.6
2025.10   94.7
2025.10   94.4
2025.10   93.5
2025.10   93.1
2025.10   92.7
2025.10   92.1
2025.10   91.4
2025.10   89.6
2025.10   89.1
2025.10   89.1
2025.10   89.1
2025.10   88.9
2025.10   88.3
2025.10   87.1
2025.10   86.6
2025.10   85.9
2025.10   85.9
2025.10   84.8
2025.10   84.7
2025.10   84.1
2025.10   83.5
2025.10   83.3
2025.10   83.0
2025.10   80.7
2025.10   80.6
2025.10   79.8
2025.10   79.4
2025.10   79.4
2025.10   77.9
2025.10   75.8
2025.10   75.1
2025.10   75.0
2025.10   73.8
2025.10   73.6
2025.10   71.8
2025.10   70.1
2025.10   69.8
2025.10   69.3
2025.10   69.0
2025.10   68.1
2025.10   67.9
2025.10   67.1
2025.10   66.3
2025.10   65.2
2025.10   61.8
2025.10   61.2
2025.10   60.7
2025.10   60.3
2025.10   59.9
2025.10   59.1
2025.10   58.6
2025.10   57.3
2025.10   57.1
2025.10   56.7
2025.10   54.8
2025.10   54.6
2025.10   54.3
2025.10   52.8
2025.10   51.8
2025.10   50.5
2025.10   48.7
2025.10   48.0
2025.10   46.4
2025.10   46.0
2025.10   44.3
2025.10   43.5
2025.10   43.2
2025.10   41.7
2025.10   41.1
2025.10   40.6
2025.10   36.7
2025.10   36.6
2025.10   35.0
2025.10   33.9
2025.10   27.2
2025.10   27.2
2025.10   27.2
2025.10   26.4
2025.10   26.4
2025.10   26.3
2025.10   19.1
2025.10   19.1
2025.10   19.1
2025.10   18.5