
CAR: Conceptualization-Augmented Reasoner for Zero-Shot Commonsense Question Answering

About

The task of zero-shot commonsense question answering evaluates models on their capacity to reason about general scenarios beyond those presented in specific datasets. Existing approaches for tackling this task leverage external knowledge from CommonSense Knowledge Bases (CSKBs) by pretraining the model on synthetic QA pairs constructed from CSKBs. In these approaches, negative examples (distractors) are formulated by randomly sampling from CSKBs using fairly primitive keyword constraints. However, two bottlenecks limit these approaches: the inherent incompleteness of CSKBs limits the semantic coverage of synthetic QA pairs, and the lack of human annotations makes the sampled negative examples potentially uninformative and contradictory. To tackle these limitations, we propose Conceptualization-Augmented Reasoner (CAR), a zero-shot commonsense question-answering framework that fully leverages the power of conceptualization. Specifically, CAR abstracts a commonsense knowledge triple to many higher-level instances, which increases the coverage of the CSKB and expands the ground-truth answer space, reducing the likelihood of selecting false-negative distractors. Extensive experiments demonstrate that CAR more robustly generalizes to answering questions about zero-shot commonsense scenarios than existing methods, including large language models such as GPT-3.5 and ChatGPT. Our code, data, and model checkpoints are available at https://github.com/HKUST-KnowComp/CAR.
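To make the core idea concrete, here is a minimal, hypothetical sketch of conceptualization-based distractor filtering. It is not the paper's actual implementation: the `CONCEPTS` map, function names, and examples are all illustrative stand-ins for a real conceptualization resource. A distractor whose abstractions overlap those of the gold answer is treated as a likely false negative and discarded.

```python
# Toy conceptualization map: instance -> higher-level concepts.
# A real system would draw these abstractions from a conceptualized CSKB.
CONCEPTS = {
    "coffee": ["drink", "hot beverage"],
    "tea": ["drink", "hot beverage"],
    "rock": ["object"],
}

def conceptualize(instance: str) -> set[str]:
    """Return the instance itself plus its higher-level abstractions."""
    return {instance, *CONCEPTS.get(instance, [])}

def filter_distractors(gold: str, candidates: list[str]) -> list[str]:
    """Keep only candidates whose abstractions are disjoint from the
    gold answer's, since overlapping candidates may also be valid
    answers (false negatives)."""
    gold_concepts = conceptualize(gold)
    return [c for c in candidates if conceptualize(c).isdisjoint(gold_concepts)]

# "tea" shares the abstraction "drink" with "coffee", so it is dropped;
# "rock" survives as a safe distractor.
print(filter_distractors("coffee", ["tea", "rock"]))  # -> ['rock']
```

Expanding the ground-truth side works the same way in reverse: every abstraction of the gold answer joins the acceptable-answer space, so sampling avoids it.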

Weiqi Wang, Tianqing Fang, Wenxuan Ding, Baixuan Xu, Xin Liu, Yangqiu Song, Antoine Bosselut • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Commonsense Reasoning | WinoGrande | Accuracy: 78.2 | 776 |
| Physical Interaction Question Answering | PIQA | Accuracy: 78.6 | 323 |
| Physical Commonsense Reasoning | PIQA (val) | Accuracy: 78.6 | 113 |
| Social Interaction Question Answering | SIQA | Accuracy: 64.8 | 85 |
| Abductive Natural Language Inference | aNLI (leaderboard) | Accuracy: 79.6 | 47 |
| Commonsense Question Answering | SocialIQA (SIQA) (val) | Accuracy: 64.0 | 24 |
| Commonsense Question Answering | CommonsenseQA (CSQA) (val) | Accuracy: 69.3 | 23 |
| Commonsense Question Answering | Abductive NLI (aNLI) (val) | Accuracy: 79.6 | 21 |
| Commonsense Question Answering | WinoGrande (WG) (val) | Accuracy: 78.2 | 21 |
