
Natural Language Understanding on ARC Challenge

Accuracy: 95.3

LLaMA-3.1-405B Base

[Chart: accuracy over time. Y-axis ticks: 38.1, 52.95, 67.8, 82.65. X-axis: Oct 17, 2024 to May 19, 2026.]
Updated 13d ago

Evaluation Results

Date     Accuracy
2026.01  95.3      -
2026.01  95.3      -
2026.01  94.3      -
2026.02  78.4      -
2026.02  77.6      -
2026.02  75        -
2026.02  74.9      -
2026.02  70.2      -
2026.02  63.9      -
2026.05  62.54     78.64
2024.10  61.4      -
2024.10  56.6      -
2024.10  54.9      -
2024.10  47.5      -
2026.05  43.34     66.89
2024.10  40.3      -