
Logical Reasoning on ProofWriter (test)

92.32 Accuracy

HBLR

[Chart: accuracy over time, Oct 28, 2023 to May 22, 2026]
Updated 8d ago

Evaluation Results

Date      Accuracy
2025.12   92.32     -
2025.12   89.48     -
2025.12   89.41     -
2025.12   88.34     -
2026.05   87        -
2025.12   86.27     -
2025.12   84.35     -
2025.12   84.15     -
2025.12   83.38     -
2025.12   82.48     -
2025.12   82.11     -
2025.12   79.34     -
2023.10   79.17     14.63
2026.05   77.4      -
2026.05   75        -
2026.05   74        -
2023.10   72        15.04
2026.05   72        -
2023.10   71.67     16.76
2023.10   70.67     17.46
2026.05   70.4      -
2023.10   70.33     24.57
2023.10   69.33     16
2026.05   69        -
2023.10   68.83     16.52
2025.12   68.11     -
2026.05   67.6      -
2023.10   67.41     1
2026.05   67        -
2026.05   66        -
2025.12   63.24     -
2026.05   61.4      -
2023.10   59.16     18.81
2025.12   59.03     -
2025.12   58.62     -
2026.05   57        -
2026.05   57        -
2025.12   56.84     -
2023.10   55.17     16.89
2025.12   54.46     -
2023.10   54.16     24.88
2025.12   52.67     -
2023.10   50.17     18.49
2025.12   49.17     -
2026.05   49        -
2023.10   48.67     16
2026.05   47        -
2023.10   46.83     1
2026.05   46        -
2023.10   45        1
2026.05   42.6      -
2026.05   42.4      -
2026.05   38        -
2025.12   36.53     -
2023.10   36.17     1
2026.05   33        -
2026.05   29.6      -