Generating Training Data with Language Models: Towards Zero-Shot Language Understanding

About

Pretrained language models (PLMs) have demonstrated remarkable performance in various natural language processing tasks: Unidirectional PLMs (e.g., GPT) are well known for their superior text generation capabilities; bidirectional PLMs (e.g., BERT) have been the prominent choice for natural language understanding (NLU) tasks. While both types of models have achieved promising few-shot learning performance, their potential for zero-shot learning has been underexplored. In this paper, we present a simple approach that uses both types of PLMs for fully zero-shot learning of NLU tasks without requiring any task-specific data: A unidirectional PLM generates class-conditioned texts guided by prompts, which are used as the training data for fine-tuning a bidirectional PLM. With quality training data selected based on the generation probability and regularization techniques (label smoothing and temporal ensembling) applied to the fine-tuning stage for better generalization and stability, our approach demonstrates strong performance across seven classification tasks of the GLUE benchmark (e.g., 72.3/73.8 on MNLI-m/mm and 92.8 on SST-2), significantly outperforming zero-shot prompting methods and even achieving results comparable to strong few-shot approaches that use 32 training samples per class.
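The selection and regularization steps named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The function names are hypothetical, ranking by mean token log-probability is an assumed reading of "generation probability", and the smoothing and moving-average formulas are the common textbook variants, which may differ from what the paper uses.

```python
from collections import defaultdict
from statistics import mean


def select_top_per_class(samples, k):
    """Keep the k generations per class with the highest mean token log-probability.

    `samples` is a list of (text, label, token_logprobs) tuples. Averaging over
    tokens is an assumption of this sketch, chosen so that longer texts are not
    penalized simply for having more tokens.
    """
    by_class = defaultdict(list)
    for text, label, token_logprobs in samples:
        by_class[label].append((mean(token_logprobs), text))

    selected = []
    for label, scored in by_class.items():
        # Highest average log-probability first.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        selected.extend((text, label) for _, text in scored[:k])
    return selected


def smooth_label(label, num_classes, eps=0.1):
    """Common label smoothing: mix the one-hot target with a uniform distribution."""
    target = [eps / num_classes] * num_classes
    target[label] += 1.0 - eps
    return target


def update_ensemble(ensemble, prediction, momentum=0.9):
    """One temporal-ensembling step: exponential moving average of predictions."""
    return [momentum * z + (1.0 - momentum) * p for z, p in zip(ensemble, prediction)]


if __name__ == "__main__":
    # Toy generations with made-up per-token log-probabilities.
    generated = [
        ("A wonderful, moving film.", 1, [-0.5, -0.7, -0.6]),
        ("Film good the was.", 1, [-3.0, -2.5, -4.0]),
        ("A dull, lifeless mess.", 0, [-0.8, -0.9, -0.7]),
        ("Mess the dull.", 0, [-3.5, -3.1, -2.9]),
    ]
    print(select_top_per_class(generated, k=1))
    print(smooth_label(1, num_classes=2, eps=0.1))
```

In a real pipeline the log-probabilities would come from the unidirectional PLM that produced the text, and the smoothed targets and ensembled predictions would feed the loss used to fine-tune the bidirectional PLM.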

Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han • 2022

Related benchmarks

Task	Dataset	Metric	Result
Natural Language Understanding	GLUE	SST-2	92.8	551
Sentiment Analysis	SST-2	Accuracy	86.7	165
Topic Classification	AG News (test)	Accuracy	77.4	116
Sentiment Analysis	IMDB	Accuracy	84.58	73
Topic Classification	DBPedia (test)	Accuracy	66.5	64
Sentiment Classification	Yelp (test)	Accuracy	93.6	46
Topic Classification	Yahoo (test)	Accuracy	40.8	36
Sentiment Analysis	Yelp	Accuracy	89.98	34
Sentiment Analysis	Rotten Tomato	Accuracy	79.08	25
Topic Classification	NYT (test)	Accuracy	53.9	18

Showing 10 of 12 rows
