OpenBookQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | OpenBookQA | Accuracy 94.4 | 465 |
| Question Answering | OpenBookQA | Accuracy 96.07 | 305 |
| Question Answering | OpenBookQA | Accuracy 84.4 | 145 |
| Question Answering | OpenBookQA (OBQA) (test) | OBQA Accuracy 92.4 | 130 |
| Question Answering | OpenBookQA | Accuracy 95.2 | 119 |
| Commonsense Reasoning | OpenBookQA | Accuracy 91.2 | 108 |
| Question Answering | OpenBookQA | Normalized Accuracy 55.6 | 102 |
| Reasoning | OpenBookQA | Accuracy 88.4 | 92 |
| Science Question Answering | OpenBookQA | Accuracy 94.6 | 82 |
| Multiple-choice Question Answering | OpenBookQA (test) | Accuracy 91 | 61 |
| Open-book Question Answering | OpenBookQA 1.0 (test) | Accuracy 35 | 33 |
| Zero-shot Reasoning | OpenbookQA | Accuracy 44 | 26 |
| Multiple Choice Question Answering | OpenBookQA | Accuracy 44.4 | 26 |
| Question Answering | OpenBook-QA | Accuracy 91.6 | 24 |
| Question Answering | OpenbookQA | Accuracy 87.6 | 23 |
| Question Answering | OpenbookQA (OQA) (val) | Accuracy 36.6 | 22 |
| Question Answering | OpenBookQA (dev) | Accuracy 90 | 22 |
| Common Sense | OpenBookQA | Accuracy 81.8 | 21 |
| Question Answering | OpenBookQA | Composite Score 92.14 | 20 |
| Question Answering | OpenBookQA | Attack Success Rate (ASR) 100 | 20 |
| Question Answering | OpenBookQA | Accuracy 85.1 | 17 |
| Question Answering | OpenBookQA | OpQA Score 47 | 15 |
| Question Answering | OpenBookQA | Mean Per-Step Regret 0.157 | 15 |
| Question Answering | OpenBookQA published (test) | Accuracy 65.4 | 15 |
| Commonsense Reasoning | OpenBookQA | Accuracy (Inter-layer) 75.6 | 15 |
Showing 25 of 72 rows