Iterative Learning with Open-set Noisy Labels

About

Large-scale datasets possessing clean label annotations are crucial for training Convolutional Neural Networks (CNNs). However, labeling large-scale data can be very costly and error-prone, and even high-quality datasets are likely to contain noisy (incorrect) labels. Existing works usually employ a closed-set assumption, whereby the samples associated with noisy labels possess a true class contained within the set of known classes in the training data. However, such an assumption is too restrictive for many applications, since samples associated with noisy labels might in fact possess a true class that is not present in the training data. We refer to this more complex scenario as the open-set noisy label problem and show that making accurate predictions in this setting is nontrivial. To address this problem, we propose a novel iterative learning framework for training CNNs on datasets with open-set noisy labels. Our approach detects noisy labels and learns deep discriminative features in an iterative fashion. To benefit from the noisy label detection, we design a Siamese network to encourage clean labels and noisy labels to be dissimilar. A reweighting module is also applied to simultaneously emphasize the learning from clean labels and reduce the effect caused by noisy labels. Experiments on CIFAR-10, ImageNet and real-world noisy (web-search) datasets demonstrate that our proposed model can robustly train CNNs in the presence of a high proportion of open-set as well as closed-set noisy labels.

Yisen Wang, Weiyang Liu, Xingjun Ma, James Bailey, Hongyuan Zha, Le Song, Shu-Tao Xia • 2018
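
The reweighting idea mentioned in the abstract (emphasize samples believed to be clean, reduce the effect of samples believed to be noisy) can be illustrated with a weighted cross-entropy. The following is a minimal sketch under stated assumptions, not the paper's actual formulation: the function names and the `clean_weights` input are hypothetical, and the weights are assumed to come from some separate noisy-label detector that is not shown here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reweighted_cross_entropy(batch_logits, labels, clean_weights):
    """Weighted mean of per-sample cross-entropy.

    clean_weights[i] is an assumed confidence in [0, 1] that sample i
    carries a clean label (hypothetical input; how it is estimated is
    outside this sketch). A weight of 0 removes the sample from the loss.
    """
    total_weight = sum(clean_weights)
    if total_weight == 0:
        return 0.0  # no sample is trusted, so nothing contributes
    loss = 0.0
    for logits, y, w in zip(batch_logits, labels, clean_weights):
        probs = softmax(logits)
        loss += w * -math.log(probs[y])
    return loss / total_weight


# Usage: the second sample's given label (0) disagrees strongly with the
# model's prediction (class 1), as a mislabeled sample might.
batch_logits = [[2.0, 0.0], [0.0, 5.0]]
labels = [0, 0]
uniform = reweighted_cross_entropy(batch_logits, labels, [1.0, 1.0])
downweighted = reweighted_cross_entropy(batch_logits, labels, [1.0, 0.0])
```

With the suspect sample's weight set to zero, the loss reduces to the cross-entropy of the trusted sample alone, so the suspect label no longer drives the gradient.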

Related benchmarks

Task	Dataset	Result
Image Classification	ImageNet ILSVRC-2012 (val)	Top-1 Accuracy: 61.6	441
Image Classification	ILSVRC 2012 (val)	Top-1 Accuracy: 61.6	156
Image Classification	WebVision mini (val)	Top-1 Accuracy: 65.24	78
Image Classification	CIFAR-100 (test)	--	72
Image Classification	WebVision (val)	Top-1 Acc: 65.24	57
Image Classification	T-ImageNet (test)	--	37
Image Classification	noise padded CIFAR-10 (test)	Test Accuracy: 78.15	21
Image Classification	CIFAR-10 (IND) + Places-365 (OOD)	Test Acc: 76.36	20
Image Classification	CIFAR-10 (test)	Accuracy (Sym. Noise Rate 0.2): 87.9	15
Image Classification	CIFAR-10 (test)	--	14

Showing 10 of 22 rows
