
MaxUp: A Simple Way to Improve Generalization of Neural Network Training

About

We propose \emph{MaxUp}, an embarrassingly simple yet highly effective technique for improving the generalization performance of machine learning models, especially deep neural networks. The idea is to generate a set of augmented data with random perturbations or transforms and minimize the maximum, or worst-case, loss over the augmented data. By doing so, we implicitly introduce a smoothness or robustness regularization against the random perturbations, and hence improve the generalization performance. For example, in the case of Gaussian perturbation, \emph{MaxUp} is asymptotically equivalent to using the gradient norm of the loss as a penalty to encourage smoothness. We test \emph{MaxUp} on a range of tasks, including image classification, language modeling, and adversarial certification, on which \emph{MaxUp} consistently outperforms the existing best baseline methods, without introducing substantial computational overhead. In particular, we improve ImageNet classification from the state-of-the-art top-1 accuracy of $85.5\%$ without extra data to $85.8\%$. Code will be released soon.
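The objective described above (draw several perturbed copies of each input, keep only the worst-case loss per example, then average over the batch) can be sketched in a few lines. This is an illustrative numpy sketch with a toy squared-error linear model; the function name, model, and parameters are assumptions for demonstration, not the paper's released code.

```python
import numpy as np

def maxup_loss(x, y, w, m=4, sigma=0.1, rng=None):
    """MaxUp loss sketch: for each input, draw m Gaussian-perturbed
    copies, evaluate the loss on each copy, and keep the worst case.
    Uses a toy squared-error linear model for illustration."""
    rng = np.random.default_rng(0) if rng is None else rng
    worst = np.empty(len(x))
    for i, (xi, yi) in enumerate(zip(x, y)):
        # m perturbed copies of this input (Gaussian perturbation)
        copies = xi + sigma * rng.standard_normal((m, xi.shape[0]))
        losses = (copies @ w - yi) ** 2   # per-copy loss
        worst[i] = losses.max()           # worst case over the copies
    return worst.mean()                   # average worst-case loss over batch
```

With `sigma = 0` every copy equals the clean input, so the objective reduces to the ordinary training loss; for `sigma > 0`, minimizing this quantity penalizes sensitivity to the perturbations, which is the smoothness regularization the abstract refers to.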

Chengyue Gong, Tongzheng Ren, Mao Ye, Qiang Liu • 2020

Related benchmarks

Task                  Dataset                      Metric             Value   Rank
Language Modeling     WikiText-2 (test)            PPL                39.61   1541
Image Classification  ImageNet-1k (val)            Top-1 Accuracy     85.8    1453
Image Classification  ImageNet (val)               Top-1 Accuracy     78.9    1206
Image Classification  CIFAR-10 (test)              Accuracy           97.18   906
Image Classification  ImageNet-1k (val)            Top-1 Accuracy     85.8    706
Image Classification  ImageNet ILSVRC-2012 (val)   Top-1 Accuracy     78.9    405
Language Modeling     WikiText2 (val)              Perplexity (PPL)   41.29   277
Image Classification  ImageNet 2012 (val)          Top-1 Accuracy     85.8    202
Image Classification  CIFAR100 (test)              Test Accuracy      82.48   147
Language Modeling     Penn Treebank (PTB) (test)   Perplexity         50.29   120

(showing 10 of 12 rows)
