FlipDA: Effective and Robust Data Augmentation for Few-Shot Learning
About
Most previous methods for text data augmentation are limited to simple tasks and weak baselines. We explore data augmentation on hard tasks (i.e., few-shot natural language understanding) and strong baselines (i.e., pretrained models with over one billion parameters). Under this setting, we reproduced a large number of previous augmentation methods and found that these methods bring marginal gains at best and sometimes substantially degrade performance. To address this challenge, we propose FlipDA, a novel data augmentation method that jointly uses a generative model and a classifier to generate label-flipped data. Central to FlipDA is the discovery that generating label-flipped data is more crucial to performance than generating label-preserved data. Experiments show that FlipDA achieves a good tradeoff between effectiveness and robustness -- it substantially improves many tasks while not negatively affecting the others.
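The core selection loop can be sketched as follows. This is a minimal illustration, not the paper's implementation: `generate_candidates` and `toy_classifier` are hypothetical stand-ins (FlipDA uses a pretrained T5 for pattern-based cloze generation and the task classifier itself for filtering), but the filtering rule — keep only generated examples whose predicted label flips with sufficient confidence — matches the idea described above.

```python
import random

def generate_candidates(text, n=5, rng=None):
    # Stand-in for T5 cloze-style generation: each candidate simply
    # drops one random word from the input.
    rng = rng or random.Random(0)
    words = text.split()
    return [" ".join(words[:i] + words[i + 1:])
            for i in (rng.randrange(len(words)) for _ in range(n))]

def toy_classifier(text):
    # Stand-in for the few-shot classifier: a keyword-based
    # probability distribution over two sentiment labels.
    p_pos = 0.9 if "good" in text else 0.1
    return {"positive": p_pos, "negative": 1.0 - p_pos}

def flipda_augment(text, label, threshold=0.5):
    """Keep only candidates whose predicted label FLIPS relative to the
    original label, with classifier confidence >= threshold."""
    flipped = []
    for cand in generate_candidates(text):
        probs = toy_classifier(cand)
        pred = max(probs, key=probs.get)
        if pred != label and probs[pred] >= threshold:
            flipped.append((cand, pred))  # label-flipped augmented pair
    return flipped

augmented = flipda_augment("the movie was good fun", "positive")
```

Every pair returned by `flipda_augment` carries the new (flipped) label, so the augmented examples can be added directly to the training set.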
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Sentiment Analysis | SST-2 (test) | Accuracy | 94.3 | 136 |
| Commonsense Question Answering | CSQA (test) | Accuracy | 0.77 | 127 |
| Natural Language Inference | MNLI (matched) | Accuracy | 68.8 | 110 |
| Topic Classification | AG News (test) | Accuracy | 85.2 | 98 |
| Natural Language Inference | MNLI (mismatched) | Accuracy | 68.9 | 68 |
| Aspect-based Sentiment Analysis | SemEval Restaurant 2014 (All) | F1 Score | 51.38 | 19 |
| Aspect-based Sentiment Analysis | SemEval Laptop 2014 | F1 Score | 32.81 | 19 |
| Natural Language Understanding | SuperGLUE few-shot | BoolQ Accuracy | 0.818 | 16 |
| Emotion Classification | TweetEmo (test) | Accuracy | 76.7 | 13 |
| Conditional Text Generation | CommonGen | ROUGE-1 | 46.81 | 6 |