
Zero-shot Reasoning on OpenBookQA

Accuracy: 44

Llama2-7B

[Chart: Accuracy over time, Feb 16, 2026 to Apr 2, 2026]
Updated 16d ago

Evaluation Results

Date      Accuracy
2026.02   44
2026.04   34
2026.02   32.8
2026.04   31
2026.02   30.4
2026.04   30
2026.02   29.2
2026.04   29
2026.04   27
2026.02   26
2026.04   26
2026.02   25.2
2026.04   25
2026.04   23
2026.04   22
2026.04   21
2026.04   16
2026.04   16
2026.04   14
2026.02   13.4
2026.04   10
2026.04   9
2026.04   8
2026.04   6
2026.04   6
2026.04   5