Question Answering on OBQA (acc_norm)

Accuracy (Normalized): 41.8

No pruning (Original)

[Chart: Accuracy (Normalized) over time, Nov 19, 2025 to Mar 25, 2026]
Updated 6d ago

Evaluation Results

Date      Accuracy (Normalized)
2026.03   41.8
2025.11   38.8
2025.11   38.8
2025.11   38.2
2025.11   37.2
2025.11   36
2026.03   35.4
2025.11   35.2
2025.11   35
2025.11   35
2025.11   34.6
2025.11   34.4
2025.11   33.2
2025.11   32.2
2025.11   32.2
2026.03   31.2
2025.11   31.2
2026.03   30
2025.11   30
2025.11   30
2025.11   29
2025.11   28.8
2026.03   27.6
2025.11   27.2
2026.03   27
2026.03   27
2026.03   27
2026.03   25
2025.11   22.4