
Unsupervised Label Noise Modeling and Loss Correction

About

Despite being robust to small amounts of label noise, convolutional neural networks trained with stochastic gradient methods have been shown to easily fit random labels. When there is a mixture of correct and mislabelled targets, networks tend to fit the former before the latter. This suggests using a suitable two-component mixture model as an unsupervised generative model of sample loss values during training, allowing online estimation of the probability that a sample is mislabelled. Specifically, we propose a beta mixture to estimate this probability and correct the loss by relying on the network prediction (the so-called bootstrapping loss). We further adapt mixup augmentation to drive our approach a step further. Experiments on CIFAR-10/100 and TinyImageNet demonstrate a robustness to label noise that substantially outperforms the recent state of the art. Source code is available at https://git.io/fjsvE
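The core idea above can be sketched in code: fit a two-component beta mixture to the per-sample training losses with EM, then read off each sample's posterior probability of belonging to the high-loss (presumed mislabelled) component. This is a minimal NumPy/SciPy sketch, not the authors' implementation; the function name `fit_bmm`, the initial parameters, and the method-of-moments M-step are illustrative assumptions.

```python
import numpy as np
from scipy.stats import beta as beta_dist

def fit_bmm(losses, n_iter=10, eps=1e-4):
    """Fit a two-component beta mixture to per-sample losses via EM.

    Returns, for each sample, the posterior probability of the
    higher-mean (high-loss, presumed mislabelled) component.
    """
    # Normalize losses into (0, 1), the support of the beta distribution.
    x = (losses - losses.min()) / (losses.max() - losses.min() + eps)
    x = np.clip(x, eps, 1 - eps)

    w = np.array([0.5, 0.5])            # mixing weights
    params = [(2.0, 5.0), (5.0, 2.0)]   # (alpha, beta): low-loss and high-loss inits

    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        pdf = np.stack([beta_dist.pdf(x, a, b) for a, b in params])
        resp = w[:, None] * pdf
        resp /= resp.sum(axis=0, keepdims=True)

        # M-step: weighted method-of-moments update of each component.
        for k in range(2):
            m = np.average(x, weights=resp[k])
            v = np.average((x - m) ** 2, weights=resp[k])
            common = max(m * (1 - m) / v - 1, eps)
            params[k] = (m * common, (1 - m) * common)
        w = resp.mean(axis=1)

    # The component with the larger mean models the noisy (high-loss) samples.
    noisy = int(np.argmax([a / (a + b) for a, b in params]))
    return resp[noisy]
```

In the paper's setting, this posterior would then weight a bootstrapping-style correction, mixing each sample's given label with the network's own prediction in proportion to how likely the label is to be wrong.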

Eric Arazo, Diego Ortego, Paul Albert, Noel E. O'Connor, Kevin McGuinness • 2019

Related benchmarks

Task                 | Dataset                    | Metric         | Result | Rank
Image Classification | CIFAR-100 (test)           | Accuracy       | 73.9   | 3518
Image Classification | CIFAR-10 (test)            | Accuracy       | 94.5   | 3381
Image Classification | CIFAR-10 (test)            | Accuracy       | 93.8   | 906
Image Classification | CIFAR-100 (val)            | Accuracy       | 78.64  | 661
Image Classification | CIFAR-100                  | --             | --     | 622
Image Classification | Clothing1M (test)          | Accuracy       | 71     | 546
Image Classification | CIFAR-10                   | Accuracy       | 94     | 471
Image Classification | ImageNet ILSVRC-2012 (val) | Top-1 Accuracy | 68.52  | 405
Image Classification | TinyImageNet (test)        | Accuracy       | 60     | 366
Image Classification | CIFAR-10 (val)             | Top-1 Accuracy | 94     | 329

Showing 10 of 66 rows
