
There Are Many Consistent Explanations of Unlabeled Data: Why You Should Average

About

Presently the most successful approaches to semi-supervised learning are based on consistency regularization, whereby a model is trained to be robust to small perturbations of its inputs and parameters. To understand consistency regularization, we conceptually explore how loss geometry interacts with training procedures. The consistency loss dramatically improves generalization performance over supervised-only training; however, we show that SGD struggles to converge on the consistency loss and continues to make large steps that lead to changes in predictions on the test data. Motivated by these observations, we propose to train consistency-based methods with Stochastic Weight Averaging (SWA), a recent approach which averages weights along the trajectory of SGD with a modified learning rate schedule. We also propose fast-SWA, which further accelerates convergence by averaging multiple points within each cycle of a cyclical learning rate schedule. With weight averaging, we achieve the best known semi-supervised results on CIFAR-10 and CIFAR-100, over many different quantities of labeled training data. For example, we achieve 5.0% error on CIFAR-10 with only 4000 labels, compared to the previous best result in the literature of 6.3%.
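The weight-averaging idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: weights are represented as plain lists of floats, and the names `update_running_average`, `average_checkpoints`, and `collection_epochs`, as well as the `every` parameter, are hypothetical and introduced only for illustration. The sketch assumes that checkpoints are averaged with equal weight, that collecting once per cycle corresponds to SWA, and that collecting several times per cycle corresponds to fast-SWA.

```python
from typing import List, Sequence


def update_running_average(
    avg: Sequence[float], new: Sequence[float], n_averaged: int
) -> List[float]:
    """Fold one more weight vector into the mean of n_averaged earlier ones."""
    return [(a * n_averaged + w) / (n_averaged + 1) for a, w in zip(avg, new)]


def average_checkpoints(checkpoints: Sequence[Sequence[float]]) -> List[float]:
    """Equal-weight average of weight vectors collected along a training run."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    avg = list(checkpoints[0])
    for n, weights in enumerate(checkpoints[1:], start=1):
        avg = update_running_average(avg, weights, n)
    return avg


def collection_epochs(start_epoch: int, num_epochs: int, every: int) -> List[int]:
    """Epochs after start_epoch at which a checkpoint is collected.

    Assumption for illustration: `every` equal to the cycle length collects
    once per cycle; a smaller `every` collects several points per cycle.
    """
    return [
        e
        for e in range(start_epoch + 1, num_epochs + 1)
        if (e - start_epoch) % every == 0
    ]


if __name__ == "__main__":
    # Three toy "checkpoints" with two parameters each.
    print(average_checkpoints([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]))
    # With a hypothetical cycle length of 3 epochs:
    print(collection_epochs(0, 6, every=3))  # one point per cycle
    print(collection_epochs(0, 6, every=1))  # several points per cycle
```

In a real training loop the averaged weights would be those of a neural network, and any batch-normalization statistics would typically need to be recomputed for the averaged model before evaluation.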

Ben Athiwaratkun, Marc Finzi, Pavel Izmailov, Andrew Gordon Wilson • 2018

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | CIFAR-100 (test) | Accuracy | 72.11 | 3518 |
| Image Classification | CIFAR-10 (test) | Accuracy | 95 | 3381 |
| Image Classification | CIFAR-10 | - | - | 507 |
| Image Classification | CIFAR10 | - | - | 70 |
| Image Classification | CIFAR-10 4,000 labels (test) | Test Error Rate | 5 | 57 |
| Image Classification | CIFAR-100 10k labels | Test Error Rate | 0.3362 | 29 |
| Image Classification | CIFAR-100 | Top-1 Error Rate | 34.1 | 18 |
| Image Classification | CIFAR-10 1k labels (test) | Test Error Rate | 16.84 | 9 |
| Image Classification | CIFAR-10 2k labels (test) | Test Error Rate | 12.24 | 8 |
| Domain Adaptation | STL-10 (test) | Test Error | 16.8 | 5 |
