
CAR: Conceptualization-Augmented Reasoner for Zero-Shot Commonsense Question Answering

About

The task of zero-shot commonsense question answering evaluates models on their capacity to reason about general scenarios beyond those presented in specific datasets. Existing approaches to this task leverage external knowledge from CommonSense Knowledge Bases (CSKBs) by pretraining the model on synthetic QA pairs constructed from CSKBs. In these approaches, negative examples (distractors) are formulated by randomly sampling from CSKBs using fairly primitive keyword constraints. However, two bottlenecks limit these approaches: the inherent incompleteness of CSKBs limits the semantic coverage of synthetic QA pairs, and the lack of human annotations makes the sampled negative examples potentially uninformative and contradictory. To address these limitations, we propose Conceptualization-Augmented Reasoner (CAR), a zero-shot commonsense question-answering framework that fully leverages the power of conceptualization. Specifically, CAR abstracts a commonsense knowledge triple to many higher-level instances, which increases the coverage of the CSKB and expands the ground-truth answer space, reducing the likelihood of selecting false-negative distractors. Extensive experiments demonstrate that CAR generalizes more robustly to questions about zero-shot commonsense scenarios than existing methods, including large language models such as GPT3.5 and ChatGPT. Our code, data, and model checkpoints are available at https://github.com/HKUST-KnowComp/CAR.
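
The distractor-sampling idea described above can be illustrated with a small toy sketch. This is not the authors' implementation: the triples, the `CONCEPTS` mapping, the stopword list, and the specific keyword rule (no shared keywords between question heads) are all hypothetical assumptions chosen for illustration. The sketch shows how a keyword-only constraint can let a plausible answer slip in as a distractor, and how grouping triples under a shared abstract concept can filter it out.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    head: str
    relation: str
    tail: str


# Toy knowledge base (hypothetical examples, not taken from a real CSKB).
TRIPLES = [
    Triple("PersonX plays guitar", "xWant", "to perform on stage"),
    Triple("PersonX practices piano", "xWant", "to give a concert"),
    Triple("PersonX buys groceries", "xWant", "to cook dinner"),
    Triple("PersonX goes jogging", "xWant", "to take a shower"),
]

# Hypothetical conceptualizations: head event -> more abstract concept.
CONCEPTS = {
    "PersonX plays guitar": "making music",
    "PersonX practices piano": "making music",
}

STOPWORDS = {"personx", "to", "a", "an", "the"}


def keywords(text):
    """Lowercased content words of a head event."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def distractor_pool(question, triples, use_concepts):
    """Tails that may serve as distractors for `question`."""
    q_keywords = keywords(question.head)
    q_concept = CONCEPTS.get(question.head)
    pool = []
    for t in triples:
        if t == question or t.relation != question.relation:
            continue
        # Assumed keyword constraint: skip heads sharing a keyword.
        if q_keywords & keywords(t.head):
            continue
        # Assumed concept filter: a tail whose head shares the abstract
        # concept is treated as a plausible answer, not a distractor.
        if use_concepts and q_concept is not None \
                and CONCEPTS.get(t.head) == q_concept:
            continue
        pool.append(t.tail)
    return pool


def build_qa(question, triples, use_concepts, rng, n_distractors=2):
    """Assemble one synthetic multiple-choice item from a triple."""
    pool = distractor_pool(question, triples, use_concepts)
    options = rng.sample(pool, k=min(n_distractors, len(pool)))
    options.append(question.tail)
    rng.shuffle(options)
    return {
        "question": f"{question.head} [{question.relation}]",
        "options": options,
        "answer": options.index(question.tail),
    }


baseline_pool = distractor_pool(TRIPLES[0], TRIPLES, use_concepts=False)
concept_pool = distractor_pool(TRIPLES[0], TRIPLES, use_concepts=True)
item = build_qa(TRIPLES[0], TRIPLES, use_concepts=True, rng=random.Random(0))
```

In this toy setup, `baseline_pool` still contains "to give a concert", which is arguably a valid answer for someone who plays guitar (a false negative), while `concept_pool` excludes it because both heads map to the same assumed concept.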

Weiqi Wang, Tianqing Fang, Wenxuan Ding, Baixuan Xu, Xin Liu, Yangqiu Song, Antoine Bosselut • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Commonsense Reasoning | WinoGrande | Accuracy 78.2 | 1085 |
| Physical Commonsense Reasoning | PIQA | Accuracy 78.6 | 572 |
| Physical Interaction Question Answering | PIQA | Accuracy 78.6 | 333 |
| Physical Commonsense Reasoning | PIQA (val) | Accuracy 78.6 | 116 |
| Social Interaction Question Answering | SIQA | Accuracy 64.8 | 109 |
| Social Commonsense Reasoning | SIQA | Accuracy 64 | 89 |
| Commonsense Question Answering | CSQA | Accuracy 69.3 | 58 |
| Abductive Commonsense Reasoning | ANLI (test) | Accuracy 79.6 | 53 |
| Abductive Natural Language Inference | aNLI (leaderboard) | Accuracy 79.6 | 47 |
| Science Question Answering | ARC-C (test) | Accuracy 53.2 | 40 |
Showing 10 of 19 rows
