
CommonsenseQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Selective Prediction | CommonsenseQA | Power 0.9999 | 207 |
| Question Answering | CommonsenseQA | Accuracy 89.3 | 143 |
| Commonsense Reasoning | CommonSenseQA | Accuracy 91.2 | 132 |
| Question Answering | CommonsenseQA (CSQA) | Accuracy 91.2 | 124 |
| Commonsense Question Answering | CommonSenseQA | Accuracy 88.9 | 81 |
| Question Answering | CommonsenseQA IH (test) | Accuracy 88.9 | 57 |
| Commonsense Reasoning | CommonSenseQA | BS 0.1054 | 54 |
| Question Answering | CommonsenseQA IH (dev) | Accuracy 82.7 | 53 |
| Commonsense Reasoning | CommonsenseQA (val) | Accuracy 82.06 | 52 |
| Hallucination Detection | CommonsenseQA | Mean AUROC 0.7563 | 48 |
| Commonsense Reasoning | CommonsenseQA (CSQA) v1.0 (test) | Accuracy 64.11 | 46 |
| Question Answering | CommonsenseQA (test) | Accuracy 83.3 | 42 |
| Commonsense Reasoning | CommonsenseQA (test) | Accuracy 90 | 41 |
| Commonsense Reasoning | CommonsenseQA (CSQA) | Accuracy 79 | 38 |
| Commonsense Reasoning | CommonsenseQA Non-Math | Accuracy 87.31 | 32 |
| Retrieval | CommonsenseQA | Accuracy 86.81 | 25 |
| Commonsense Question Answering | CommonsenseQA (CSQA) (val) | Accuracy 75.7 | 23 |
| Commonsense Question Answering | CommonsenseQA v1.0 (dev) | Accuracy 79.3 | 22 |
| Multiple-choice Question Answering | CommonsenseQA (CSQA) | Accuracy 66.4 | 21 |
| Veracity Inference | COMMONSENSEQA 1,000 examples | Mean Hamming Similarity 0.935 | 20 |
| Knowledge | CommonSenseQA CoQA | Score 66.91 | 20 |
| Commonsense Question Answering | CommonsenseQA blind v1.0 (test) | Accuracy 75.3 | 20 |
| Multiple-choice Question Answering | CommonsenseQA (dev) | Accuracy 76.2 | 18 |
| Question Answering | CommonsenseQA | PR-AUC 0.595 | 16 |
| Common sense | CommonsenseQA | Accuracy 74 | 12 |
Showing 25 of 32 rows