Multiple-choice Question Answering on CommonsenseQA (CSQA)

66.4 Accuracy

Llama-2-13b-chat (OTTER)

[Chart: results over time, Apr 12, 2024 to Feb 6, 2026]
Updated 4d ago

Evaluation Results

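| Method | Date | Accuracy | (unlabeled) | Links |
|  | 2024.04 | 66.4 | 1.8 |  |
|  | 2024.04 | 64.5 | 3.1 |  |
|  | 2024.04 | 64 | 9.8 |  |
|  | 2024.04 | 63.4 | 12.9 |  |
|  | 2024.04 | 60.4 | 3.5 |  |
|  | 2024.04 | 58.1 | 2 |  |
|  | 2024.04 | 57.6 | 1 |  |
|  | 2024.04 | 57 | 10.2 |  |
|  | 2024.04 | 56.9 | 8.3 |  |
|  | 2024.04 | 56.5 | 15.2 |  |
|  | 2024.04 | 42.7 | 3.8 |  |
|  | 2026.02 | 37.3 | - |  |
|  | 2026.02 | 36.9 | - |  |
|  | 2026.02 | 36.2 | - |  |
|  | 2026.02 | 36.1 | - |  |
|  | 2026.02 | 35.9 | - |  |
|  | 2026.02 | 33.3 | - |  |
|  | 2026.02 | 33.2 | - |  |
|  | 2024.04 | 31.9 | 28.4 |  |
|  | 2026.02 | 29.6 | - |  |
|  | 2026.02 | 29.3 | - |  |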