ScanMix: Learning from Severe Label Noise via Semantic Clustering and Semi-Supervised Learning

About

We propose a new training algorithm, ScanMix, that uses semantic clustering and semi-supervised learning (SSL) to achieve superior robustness to severe label noise and competitive robustness to non-severe label noise, compared with state-of-the-art (SOTA) methods. ScanMix is based on the expectation-maximisation framework: the E-step estimates the latent variable used to cluster the training images based on their appearance and classification results, and the M-step optimises the SSL classification and learns effective feature representations via semantic clustering. We present a theoretical result that shows the correctness and convergence of ScanMix, and an empirical result that shows that ScanMix achieves SOTA results on CIFAR-10/-100 (with symmetric, asymmetric and semantic label noise), Red Mini-ImageNet (from the Controlled Noisy Web Labels), Clothing1M and WebVision. In all benchmarks with severe label noise, our results are competitive with the current SOTA.

Ragav Sachdeva, Filipe R Cordeiro, Vasileios Belagiannis, Ian Reid, Gustavo Carneiro • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | Clothing1M (test) | Accuracy: 74.35 | 546 |
| Image Classification | ILSVRC 2012 (test) | Top-1 Acc: 75.76 | 117 |
| Image Classification | CIFAR-100 (test) | Accuracy (Symmetric 20%): 77 | 72 |
| Image Classification | Webvision (test) | Acc: 80.04 | 57 |
| Image Classification | Red Mini-ImageNet (test) | Accuracy: 59.06 | 51 |
| Image Classification | CIFAR-10 (test) | Accuracy (Sym, 20%): 96 | 22 |
| Image Classification | CIFAR-10 semantic asymmetric noise (test) | Accuracy: 89.96 | 21 |
| Image Classification | CIFAR-100 semantic noise (test) | Accuracy: 68.44 | 21 |