
EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks

About

We present EDA: easy data augmentation techniques for boosting performance on text classification tasks. EDA consists of four simple but powerful operations: synonym replacement, random insertion, random swap, and random deletion. On five text classification tasks, we show that EDA improves performance for both convolutional and recurrent neural networks. EDA demonstrates particularly strong results for smaller datasets; on average, across five datasets, training with EDA while using only 50% of the available training set achieved the same accuracy as normal training with all available data. We also performed extensive ablation studies and suggest parameters for practical use.
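The four operations named above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract does not say where synonyms come from, so the tiny `TOY_SYNONYMS` table, the function signatures, and the parameters `n` (number of edits) and `p` (deletion probability) are all assumptions made for this example. In particular, drawing inserted words from the synonym table is a choice of this sketch.

```python
import random

# Hypothetical toy thesaurus for illustration only; a real setup would
# use a proper lexical resource instead of this hand-written table.
TOY_SYNONYMS = {
    "quick": ["fast", "speedy"],
    "happy": ["glad", "cheerful"],
    "movie": ["film"],
}


def synonym_replacement(words, n, rng, synonyms=TOY_SYNONYMS):
    """Replace up to n words that have an entry in `synonyms`."""
    out = list(words)
    candidates = [i for i, w in enumerate(out) if w in synonyms]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(synonyms[out[i]])
    return out


def random_insertion(words, n, rng, synonyms=TOY_SYNONYMS):
    """Up to n times, insert a synonym of a random word at a random position."""
    out = list(words)
    for _ in range(n):
        candidates = [w for w in out if w in synonyms]
        if not candidates:
            break
        new_word = rng.choice(synonyms[rng.choice(candidates)])
        out.insert(rng.randint(0, len(out)), new_word)
    return out


def random_swap(words, n, rng):
    """Swap two randomly chosen positions, n times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]
```

A seeded `random.Random` instance keeps the augmentations reproducible, for example `random_swap("the quick happy movie".split(), 1, random.Random(0))`. Each function returns a new list and leaves its input untouched, so several augmented copies can be generated from the same sentence.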

Jason Wei, Kai Zou • 2019

Related benchmarks

Task | Dataset | Result | Rank
Natural Language Inference | SNLI (test) | Accuracy 72.68 | 681
Question Answering | SQuAD v1.1 (dev) | F1 Score 32.4 | 375
Text Classification | AG-News | Accuracy 89.6 | 248
Text Classification | TREC | Accuracy 94.7 | 179
Question Answering | NewsQA (dev) | F1 Score 61.01 | 101
Few-shot Text Classification | 26 few-shot tasks, Random -> Random transfer setting (test) | Accuracy 45.79 | 84
Few-shot Text Classification | 26 few-shot tasks, Class -> Non-Class transfer setting (test) | Accuracy 43.51 | 84
Few-shot Text Classification | 26 few-shot tasks, Class -> Class transfer setting (test) | Accuracy 45.9 | 84
Few-shot Text Classification | 26 few-shot tasks, Non-Class -> Class transfer setting (test) | Accuracy 0.4704 | 84
Sequence Classification | IMDB | Micro F1 90.2 | 64
Showing 10 of 65 rows
