ReMixMatch: Semi-Supervised Learning with Distribution Alignment and Augmentation Anchoring
About
We improve the recently-proposed "MixMatch" semi-supervised learning algorithm by introducing two new techniques: distribution alignment and augmentation anchoring. Distribution alignment encourages the marginal distribution of predictions on unlabeled data to be close to the marginal distribution of ground-truth labels. Augmentation anchoring feeds multiple strongly augmented versions of an input into the model and encourages each output to be close to the prediction for a weakly-augmented version of the same input. To produce strong augmentations, we propose a variant of AutoAugment which learns the augmentation policy while the model is being trained. Our new algorithm, dubbed ReMixMatch, is significantly more data-efficient than prior work, requiring between $5\times$ and $16\times$ less data to reach the same accuracy. For example, on CIFAR-10 with 250 labeled examples we reach $93.73\%$ accuracy (compared to MixMatch's accuracy of $93.58\%$ with $4{,}000$ examples) and a median accuracy of $84.92\%$ with just four labels per class. We make our code and data open-source at https://github.com/google-research/remixmatch.
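The two techniques named in the abstract can be illustrated with a minimal, framework-free sketch. It is not taken from the released code. It assumes distribution alignment is done by rescaling each unlabeled prediction by the ratio of the label marginal to a running average of model predictions and then renormalizing, and that augmentation anchoring is an average cross-entropy between the weak-augmentation target and each strong-augmentation prediction. The function names (`align_distribution`, `anchoring_loss`) and the toy numbers are hypothetical.

```python
import math


def align_distribution(pred, label_marginal, running_pred_marginal, eps=1e-8):
    """Rescale one class-probability vector by label_marginal / running_pred_marginal
    and renormalize so it sums to 1 (assumed form of distribution alignment)."""
    scaled = [
        q * p / (m + eps)
        for q, p, m in zip(pred, label_marginal, running_pred_marginal)
    ]
    total = sum(scaled)
    return [s / total for s in scaled]


def cross_entropy(target, pred, eps=1e-8):
    """Cross-entropy H(target, pred) between two probability vectors."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, pred))


def anchoring_loss(target, strong_preds):
    """Average cross-entropy pulling each strong-augmentation prediction
    toward the single target derived from the weak augmentation."""
    return sum(cross_entropy(target, s) for s in strong_preds) / len(strong_preds)


# Toy two-class example (all numbers are illustrative assumptions).
label_marginal = [0.5, 0.5]          # class frequencies in the labeled set
running_pred_marginal = [0.8, 0.2]   # model currently over-predicts class 0
weak_pred = [0.7, 0.3]               # prediction on a weakly augmented input

# Alignment shifts probability mass toward the under-predicted class 1.
target = align_distribution(weak_pred, label_marginal, running_pred_marginal)

# Predictions on several strongly augmented views of the same input.
strong_preds = [[0.9, 0.1], [0.4, 0.6], [0.5, 0.5]]
loss = anchoring_loss(target, strong_preds)
print(target, loss)
```

In a real training loop the target would be treated as a constant (no gradient flows through it), and the running prediction marginal would be updated from recent unlabeled batches; both details are omitted here for brevity.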
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-100 (test) | Accuracy 74.82 | 3518 |
| Image Classification | CIFAR-10 (test) | -- | 906 |
| Image Classification | CIFAR-10 (test) | Accuracy 94.86 | 3381 |
| Image Classification | CIFAR-100 | -- | 622 |
| Image Classification | CIFAR10 (test) | Accuracy 95.28 | 585 |
| Image Classification | CIFAR-10 | -- | 507 |
| Image Classification | CIFAR100 (test) | Top-1 Accuracy 76.97 | 377 |
| Image Classification | SVHN (test) | -- | 362 |
| Image Classification | SVHN | Accuracy 97.4 | 359 |
| Image Classification | STL-10 (test) | -- | 357 |