SOTA Commonsense Question Answering benchmarks and papers with code | Wizwand
Commonsense Question Answering
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
|---|---|---|---|---|---|
| CSQA (test) | Human | Accuracy | 0.953 | 127 | 3d ago |
| CommonSenseQA | HUMAN | Accuracy | 88.9 | 81 | 3d ago |
| CSQA | SAC Single-task | Accuracy | 82.72 | 44 | 3d ago |
| CosmosQA | HUMAN | Accuracy | 94 | 36 | 3d ago |
| SocialIQA (SIQA) (val) | ChatGPT + Chain-of-thought | Accuracy | 70.7 | 24 | 3d ago |
| CommonsenseQA (CSQA) (val) | ChatGPT + Self-consistent chain-of-thought | Accuracy | 75.7 | 23 | 3d ago |
| CommonsenseQA v1.0 (dev) | Our Model | Accuracy | 79.3 | 22 | 3d ago |
| WinoGrande (WG) (val) | DeBERTa-v3-L (CANDLE Distilled) | Accuracy | 78.3 | 21 | 3d ago |
| Abductive NLI (aNLI) (val) | DeBERTa-v3-L (CANDLE Distilled) | Accuracy | 0.812 | 21 | 3d ago |
| CommonsenseQA blind v1.0 (test) | Our Model | Accuracy | 75.3 | 20 | 3d ago |
| CSQA | QTALE | PIQA | 84.06 | 18 | 3d ago |
| MCEval CSQA 8K (test) | Magn-Probe | Accuracy | 84.6 | 14 | 3d ago |
| Commonsense QA | Phi | Reusability Score | 50.97 | 12 | 3d ago |
| CSQA | UDPO | Accuracy | 85.1 | 12 | 3d ago |
| CSQA2 (test) | UL20B | Accuracy | 70.1 | 11 | 3d ago |
| CSQA (OOD) | R1 Distill -> GRPO | Accuracy | 63.8 | 10 | 3d ago |
| ARC-E | Dense | Accuracy | 72.31 | 8 | 3d ago |
| ARC-C-ZH | Dense | Score | 33.96 | 8 | 3d ago |
| ARC-C | PHSA | Accuracy | 41.13 | 8 | 3d ago |
| Scientific Commonsense (QASC) 1.0 (dev) | GPT-3 | Accuracy | 55.18 | 8 | 3d ago |
| CommonsenseQA (CSQA2) 2.0 (dev) | ELABOR | Accuracy | 58.72 | 8 | 3d ago |
| ECQA (test) | Llama2-70B | Accuracy | 79.7 | 7 | 3d ago |
| QASC (dev) | CPACE | Accuracy | 83.7 | 7 | 3d ago |
| CSQA Synonym Replacement WordNet-based (test) | G-DAUG-Rand | Accuracy | 72.1 | 6 | 3d ago |
| OpenBookQA (OBQA) 1.0 (test) | GPT-3 | Accuracy | 59.4 | 5 | 3d ago |
Showing 25 of 34 rows