Discrete Reasoning on DROP

71.59 Exact Match (EM)

GPT-4

[Chart: Exact Match (EM) over time, Nov 18, 2023 to May 18, 2026]
Updated 14d ago

Evaluation Results

Date      EM      (unlabeled)
2023.11   71.59   -
2023.11   70.88   -
2023.11   69.09   -
2023.11   64.39   -
2023.11   60.26   -
2023.11   59.62   -
2023.11   57.97   -
2023.11   54.11   -
2023.11   53.63   -
2023.11   45.97   -
2023.11   40.73   -
2026.05   37.94   -
2026.05   37.9    -
          37.69   -
          37.19   -
          36.57   -
          36.48   -
2025.12   10.3    24.3
2025.12   8.5     21.6
2025.12   2.8     14.6
2025.12   2.5     8.6
2025.12   2.4     13.3
2025.12   2.2     9.9
2025.12   0       0
2025.12   0       0