Twin Contrastive Learning with Noisy Labels
About
Learning from noisy data is challenging, as label noise significantly degrades model performance. In this paper, we present TCL, a novel twin contrastive learning model that learns robust representations and handles noisy labels for classification. Specifically, we construct a Gaussian mixture model (GMM) over the representations by injecting the supervised model predictions into the GMM, linking its label-free latent variables with the noisy annotations. TCL then detects mislabeled examples as out-of-distribution examples using another two-component GMM that takes the data distribution into account. We further propose cross-supervision with an entropy regularization loss, which bootstraps the true targets from model predictions to handle the noisy labels. As a result, TCL can learn discriminative representations aligned with the estimated labels through mixup and contrastive learning. Extensive experimental results on several standard benchmarks and real-world datasets demonstrate the superior performance of TCL. In particular, TCL achieves a 7.5% improvement on CIFAR-10 with 90% label noise, an extremely noisy scenario. The source code is available at https://github.com/Hzzone/TCL.
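The two-component GMM step can be sketched as follows. This is a minimal illustration, not TCL's actual implementation: it fits a 1-D two-component GMM over per-sample scores (here, hypothetical per-sample losses) and treats samples assigned to the low-mean component as clean, while TCL's real detection operates on its own out-of-distribution score derived from the representation-level GMM.

```python
# Hedged sketch of two-component GMM noisy-label detection.
# Assumption: per-sample loss is used as the score; TCL's exact score differs.
import numpy as np
from sklearn.mixture import GaussianMixture

def split_clean_noisy(scores, threshold=0.5):
    """Fit a two-component 1-D GMM over per-sample scores and return a
    boolean mask marking samples assigned to the low-mean ("clean") component."""
    scores = np.asarray(scores, dtype=np.float64).reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=0).fit(scores)
    # The component with the smaller mean corresponds to low-loss (clean) samples.
    clean_component = int(np.argmin(gmm.means_.ravel()))
    prob_clean = gmm.predict_proba(scores)[:, clean_component]
    return prob_clean > threshold

# Toy usage: 80 low-loss (clean) samples and 20 high-loss (noisy) samples.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(0.1, 0.05, 80), rng.normal(2.0, 0.3, 20)])
mask = split_clean_noisy(scores)
```

The posterior `prob_clean` can also be kept as a soft weight instead of thresholding, which is common in sample-selection methods for noisy labels.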
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | CIFAR-100 (test) | Accuracy | 78 | 3518 |
| Image Classification | Clothing1M (test) | Accuracy | 74.8 | 546 |
| Image Classification | ILSVRC 2012 (val) | Top-1 Accuracy | 75.4 | 156 |
| Image Classification | ILSVRC 2012 (test) | Top-1 Accuracy | 75.4 | 117 |
| Image Classification | WebVision mini (val) | Top-1 Accuracy | 79.1 | 78 |
| Image Classification | CIFAR-10 (test) | Accuracy | 92.68 | 76 |
| Image Classification | CIFAR-100 (test) | Accuracy (Symmetric 20%) | 78 | 72 |
| Image Classification | CIFAR-10 (test) | Accuracy | 95 | 68 |
| Image Classification | WebVision (test) | -- | -- | 57 |
| Image Classification | CIFAR-10 (40% asymmetric noise) | Accuracy | 93.7 | 27 |