
Generating Training Data with Language Models: Towards Zero-Shot Language Understanding

About

Pretrained language models (PLMs) have demonstrated remarkable performance in various natural language processing tasks: Unidirectional PLMs (e.g., GPT) are well known for their superior text generation capabilities; bidirectional PLMs (e.g., BERT) have been the prominent choice for natural language understanding (NLU) tasks. While both types of models have achieved promising few-shot learning performance, their potential for zero-shot learning has been underexplored. In this paper, we present a simple approach that uses both types of PLMs for fully zero-shot learning of NLU tasks without requiring any task-specific data: A unidirectional PLM generates class-conditioned texts guided by prompts, which are used as the training data for fine-tuning a bidirectional PLM. With quality training data selected based on the generation probability, and with regularization techniques (label smoothing and temporal ensembling) applied to the fine-tuning stage for better generalization and stability, our approach demonstrates strong performance across seven classification tasks of the GLUE benchmark (e.g., 72.3/73.8 on MNLI-m/mm and 92.8 on SST-2), significantly outperforming zero-shot prompting methods and even achieving results comparable to strong few-shot approaches that use 32 training samples per class.
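The abstract mentions three auxiliary techniques around the generate-then-fine-tune pipeline: selecting quality training data by generation probability, label smoothing, and temporal ensembling. The sketch below illustrates each in plain numpy; the function names and the momentum/epsilon values are illustrative assumptions, not taken from the paper's released code.

```python
import numpy as np

def select_by_generation_prob(token_logprobs, k):
    """Rank generated texts by average per-token log-probability
    (a proxy for generation quality) and keep the top-k as training data.
    Returns the kept sample indices in their original order."""
    scores = [np.mean(lp) for lp in token_logprobs]
    order = np.argsort(scores)[::-1]  # highest average log-prob first
    return sorted(order[:k].tolist())

def smooth_labels(labels, num_classes, eps=0.1):
    """Label smoothing: mix each one-hot target with the uniform
    distribution so the model is not pushed toward overconfident,
    possibly noisy, pseudo-labels."""
    one_hot = np.eye(num_classes)[labels]
    return (1.0 - eps) * one_hot + eps / num_classes

def temporal_ensemble(prev_avg, current_probs, momentum=0.9):
    """Temporal ensembling: keep an exponential moving average of the
    model's predicted class distributions across training steps/epochs,
    which can then be used as a stabilized training target."""
    return momentum * prev_avg + (1.0 - momentum) * current_probs
```

For example, `select_by_generation_prob([[-0.1, -0.2], [-2.0, -3.0], [-0.5]], k=2)` keeps the first and third samples, whose average token log-probabilities are highest.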

Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han • 2022

Related benchmarks

Task                           | Dataset        | Result          | Rank
Natural Language Understanding | GLUE           | SST-2: 92.8     | 452
Sentiment Analysis             | SST-2          | Accuracy: 86.7  | 156
Topic Classification           | AG News (test) | Accuracy: 77.4  | 98
Topic Classification           | DBPedia (test) | Accuracy: 66.5  | 64
Sentiment Analysis             | IMDB           | Accuracy: 84.58 | 57
Sentiment Classification      | Yelp (test)    | Accuracy: 93.6  | 46
Topic Classification           | Yahoo (test)   | Accuracy: 40.8  | 36
Sentiment Analysis             | Yelp           | Accuracy: 89.98 | 30
Sentiment Analysis             | Rotten Tomato  | Accuracy: 79.08 | 25
Topic Classification           | NYT (test)     | Accuracy: 53.9  | 18
(showing 10 of 12 rows)
