
Natural Language Understanding on AGIEval

71.6 Accuracy

Llama 3 405B

[Chart: accuracy by date, Nov 7, 2023 to May 24, 2026]
Updated 8d ago

Evaluation Results

Date      Accuracy
2024.07   71.6
2024.07   64.6
2024.07   61.5
2025.12   48.13
2024.07   47.8
2024.07   46
2024.07   42.7
2025.12   42.14
2026.05   35.11
2025.12   33.34
2023.11   32.7
2026.05   31.77
2026.05   31.66
2026.05   30.51
2024.10   29.3
2023.11   28.5
2024.10   28.5
2025.12   28.05
2024.03   27.8
2026.05   27.01
2025.12   26.32
2026.05   26.15
2023.11   23.2
2024.10   23.2
2023.11   21.8
2024.10   21.8
2023.11   21.2
2024.10   21.2
2024.03   19.3
2024.03   14