
OpenBookQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | OpenBookQA | Accuracy 94.4 | 465 |
| Question Answering | OpenBookQA (OBQA) (test) | OBQA Accuracy 92.4 | 130 |
| Question Answering | OpenBookQA | Accuracy 84.4 | 126 |
| Question Answering | OpenBookQA | Accuracy 95.2 | 119 |
| Question Answering | OpenBookQA | Normalized Accuracy 55.6 | 102 |
| Reasoning | OpenBookQA | Accuracy 88.4 | 77 |
| Commonsense Reasoning | OpenBookQA | Accuracy 91 | 71 |
| Multiple-choice Question Answering | OpenBookQA (test) | Accuracy 90.8 | 39 |
| Open-book Question Answering | OpenBookQA 1.0 (test) | Accuracy 35 | 33 |
| Zero-shot Reasoning | OpenbookQA | Accuracy 44 | 26 |
| Question Answering | OpenBook-QA | Accuracy 91.6 | 24 |
| Question Answering | OpenbookQA (OQA) (val) | Accuracy 36.6 | 22 |
| Question Answering | OpenBookQA (dev) | Accuracy 90 | 22 |
| Common Sense | OpenBookQA | Accuracy 81.8 | 21 |
| Question Answering | OpenBookQA | Composite Score 92.14 | 20 |
| Question Answering | OpenBookQA | Attack Success Rate (ASR) 100 | 20 |
| Multiple Choice Question Answering | OpenBookQA | Accuracy 36.4 | 18 |
| Question Answering | OpenBookQA | OpQA Score 47 | 15 |
| Question Answering | OpenBookQA | Mean Per-Step Regret 0.157 | 15 |
| Question Answering | OpenBookQA published (test) | Accuracy 65.4 | 15 |
| Commonsense Reasoning | OpenBookQA | Accuracy (Inter-layer) 75.6 | 15 |
| Question Answering | OpenBookQA Official Leaderboard | Accuracy 95.2 | 14 |
| Audio Question-Answering | OpenBookQA | Score 91.4 | 12 |
| Open Book Question Answering | OpenBookQA | Normalized Log Accuracy 89.4 | 12 |
| Question Answering | OpenBookQA D^v (train) | Accuracy 100 | 12 |
Showing 25 of 60 rows