
Reasoning on ARC Challenge

96.7 Accuracy

GPT-4o

[Chart: accuracy over time, Nov 18, 2023 to Apr 23, 2026]
Updated 16d ago

Evaluation Results

Date | Accuracy | (unlabeled)
2024.10 | 96.7 | -
- | 96.4 | -
2023.11 | 93.26 | -
2024.10 | 91 | -
2025.10 | 90.1 | -
2025.10 | 88.6 | -
2025.10 | 87.6 | -
2023.11 | 84.73 | -
- | 83.4 | -
2023.11 | 83.36 | -
2023.11 | 79.95 | -
2023.11 | 78.41 | -
2023.11 | 74.83 | -
2023.11 | 74.74 | -
2025.10 | 74.7 | -
2023.11 | 71.93 | -
- | 70.7 | -
- | 69.6 | -
2024.07 | 68.9 | -
- | 68.8 | -
2023.11 | 67.66 | -
- | 65.9 | -
2025.10 | 62.4 | -
2023.11 | 61.18 | -
2024.07 | 61.1 | -
2026.04 | 54.77 | -
2026.04 | 53.67 | -
2025.10 | 52.9 | -
2026.04 | 51.62 | -
2026.04 | 50.68 | -
2023.11 | 50.43 | -
2024.10 | 50.3 | -
2024.10 | 49.7 | -
2024.10 | 49.6 | -
2024.10 | 49 | -
2026.04 | 48.98 | -
2024.10 | 48.8 | -
2024.10 | 48.7 | -
2026.04 | 48.54 | -
2024.07 | 48.5 | -
2026.04 | 48.21 | -
2024.07 | 43.9 | -
2026.04 | 43.6 | -
2026.04 | 41.38 | -
2026.04 | 41.38 | -
2026.04 | 40.53 | -
2026.04 | 39.93 | -
2026.04 | 39.76 | -
2026.04 | 38.31 | -
2024.07 | 37.9 | -
2026.04 | 37.46 | -
2026.04 | 35 | -
2026.04 | 35 | -
- | 34.3 | -
2026.04 | 34.22 | -
2026.04 | 34 | -
2026.04 | 33.53 | -
2026.04 | 33.4 | -
2026.04 | 32.51 | -
2026.04 | 32.3 | -
2026.04 | 32.17 | -
2026.04 | 32 | -
2026.04 | 31.8 | -
2024.07 | 31.5 | -
2026.04 | 30.3 | -
2026.04 | 28.5 | -
2026.04 | 27.55 | -
2025.02 | 26.6 | -
2025.02 | 26.5 | -
2025.02 | 26.4 | -
2025.02 | 26 | -
2025.02 | 24.9 | -
2025.02 | 24.9 | -
2025.02 | 24.8 | -
2025.02 | 24.1 | -
2025.02 | 23.6 | -
2025.02 | 22.6 | -
2025.02 | 21.8 | -
2025.02 | 21.3 | -
2025.02 | 20.6 | -
2025.02 | 17.9 | -
2026.02 | - | 43.1
2026.02 | - | 45.4
2026.02 | - | 46.3
2026.02 | - | 46.7
2026.02 | - | 44.6
2026.02 | - | 46.7
2026.02 | - | 46.9
2026.02 | - | 76.1
2026.02 | - | 74.1
2026.02 | - | 74.4
2026.02 | - | 77.1
2026.02 | - | 76.7
2026.02 | - | 78.2
2026.02 | - | 79.7
2025.08 | - | 91.54
2025.08 | - | 92.22
2025.08 | - | 91.86
2025.08 | - | 92.09
2025.08 | - | 92.29
Showing 100 of 105 rows