ZeroGen: Efficient Zero-shot Learning via Dataset Generation

About

Interest in dataset generation has grown recently, driven by the superior generative capacity of large pre-trained language models (PLMs). In this paper, we study a flexible and efficient zero-shot learning method, ZeroGen. Given a zero-shot task, we first generate a dataset from scratch using PLMs in an unsupervised manner. Then, we train a tiny task model (e.g., LSTM) under the supervision of the synthesized dataset. This approach allows highly efficient inference, as the final task model has orders of magnitude fewer parameters than PLMs (e.g., GPT2-XL). Apart from being annotation-free and efficient, we argue that ZeroGen can also provide useful insights from the perspective of data-free, model-agnostic knowledge distillation and unreferenced text generation evaluation. Experiments and analysis on different NLP tasks, namely text classification, question answering, and natural language inference, show the effectiveness of ZeroGen.
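
The two-stage pipeline described above (synthesize a labeled dataset, then train a small task model on it) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' code: `stub_plm_generate` is a hypothetical stand-in for sampling from a PLM such as GPT2-XL, the prompt strings and word pools are invented, and a bag-of-words naive Bayes classifier is swapped in for the tiny task model (the abstract mentions an LSTM) so that the example runs with the standard library only.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical label-conditioned prompts (illustrative, not taken from the paper).
PROMPTS = {
    "positive": 'The movie review in positive sentiment is: "',
    "negative": 'The movie review in negative sentiment is: "',
}

# Toy word pools so the stub can imitate label-conditioned generation.
_STUB_WORDS = {
    "positive": ["great", "wonderful", "moving", "brilliant", "fun"],
    "negative": ["boring", "awful", "dull", "clumsy", "tedious"],
}
_FILLER = ["the", "film", "was", "plot", "acting", "really"]


def stub_plm_generate(prompt, rng):
    """Stand-in for PLM sampling: returns a short text conditioned on the prompt."""
    pool = _STUB_WORDS["positive" if "positive" in prompt else "negative"]
    words = rng.sample(_FILLER, 3) + rng.sample(pool, 2)
    rng.shuffle(words)
    return " ".join(words)


def synthesize_dataset(n_per_label, seed=0):
    """Stage 1: build (text, label) pairs from scratch, with no human annotation."""
    rng = random.Random(seed)
    return [
        (stub_plm_generate(prompt, rng), label)
        for label, prompt in PROMPTS.items()
        for _ in range(n_per_label)
    ]


class TinyNaiveBayes:
    """Stage 2: a small task model trained only on the synthetic data."""

    def fit(self, data):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter()
        for text, label in data:
            self.label_counts[label] += 1
            self.word_counts[label].update(text.split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        def score(label):
            counts = self.word_counts[label]
            total = sum(counts.values()) + len(self.vocab)
            # Unnormalized log prior plus add-one smoothed log likelihoods.
            s = math.log(self.label_counts[label])
            for w in text.split():
                s += math.log((counts[w] + 1) / total)
            return s

        return max(self.label_counts, key=score)


dataset = synthesize_dataset(n_per_label=50)
model = TinyNaiveBayes().fit(dataset)
print(model.predict("the film was great"))
```

In a real setting the stub would be replaced by sampling from an actual PLM, and inference would then use only the small trained model, which is where the efficiency claimed in the abstract comes from.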

Jiacheng Ye, Jiahui Gao, Qintong Li, Hang Xu, Jiangtao Feng, Zhiyong Wu, Tao Yu, Lingpeng Kong • 2022

Related benchmarks

Task	Dataset	Result
Mathematical Reasoning	SVAMP	Accuracy: 20	403
Sentiment Classification	SST2 (test)	Accuracy: 80.41	233
Sentiment Analysis	SST-2	Accuracy: 82.77	165
Sentiment Classification	IMDB (test)	--	144
Topic Classification	AG News (test)	Accuracy: 76.48	116
Question Answering	SQuAD	Exact Match: 69.4	83
Sentiment Analysis	IMDB	Accuracy: 80.41	73
Question Answering	SQuAD v1.1 (val)	F1 Score: 31.53	70
Sequence Classification	Yahoo	Micro F1: 55.04	64
Sequence Classification	MASSIVE	Micro F1: 71.8	64

Showing 10 of 38 rows
