
Science Question Answering on ARC-C

96.3 Accuracy

GPT-4

[Chart: accuracy over time, Oct 20, 2022 to Feb 16, 2026]
Updated 3d ago

Evaluation Results

Method | Links
2024.07
96.3---
96.1---
94.3---
92.9---
91.9---
2025.12
89.83---
2022.10
89.8---
2022.10
88.7---
2022.10
88.7---
2026.02
88.52350--
2025.12
88.5---
2022.10
88.3---
2022.10
87.2---
2022.10
87.1---
2026.02
86.73301--
2025.12
86.44---
2026.02
85.95361--
2022.10
85.2---
2026.02
83.05585--
2025.12
82.8---
2025.12
82.1---
2026.02
81.17793--
2026.02
81.05366--
2025.12
80.34---
2025.12
80---
2024.07
79.7---
2025.12
79.32---
2026.02
79.15642--
2024.07
78.6---
2024.07
78.2---
2025.12
78---
2025.12
77.6---
2025.12
77.29---
2025.12
77.2---
2026.02
75.48---
2026.02
75.13---
2025.12
74.9---
2026.02
74.78---
2026.02
74.7---
2026.02
74.18---
2026.02
74.13---
2026.02
74.02353--
2026.02
73.61---
2026.02
72.91617--
2025.12
72.2---
2025.12
72.2---
2025.12
70.5---
2025.12
70.17---
2026.02
69.79369--
2025.12
68.14---
2026.02
67.67785--
2026.02
67.22610--
2025.12
66.44---
2025.12
66.1---
2026.02
64.88628--
2025.12
64.75---
2026.02
64.33769--
2026.02
64.1614--
2026.01
63.7---
2025.12
62.4---
2026.01
60.6---
2026.01
60---
2025.12
59.66---
2025.12
59---
2025.12
58---
2025.12
57.6---
2025.12
56.6---
2025.05
56.19---
2025.12
55.9---
2025.05
54.84---
2025.05
54.51---
2026.01
54.4---
2025.05
53.51---
2026.02
53.5---
2026.01
53.1---
2026.01
48.1---
2025.12
47.46---
2026.01
45.9---
2026.02
45.3---
2026.02
43.7---
2026.02
43.01---
2026.02
41.6---
2025.12
41.51---
2025.12
41.47---
2025.12
41.14---
2025.12
41.14---
2025.12
40.47---
2026.02
37.9---
2025.12
36.95---
2026.02
36.6---
2022.10
36.5---
2022.10
36.5---
2025.12
36.45---
2025.12
36.12---
2025.12
35.45---
2025.12
35.12---
2022.10
34.8---
2022.10
34.8---
2025.12
34.45---
2025.12
34.45---
Showing 100 of 139 rows