Debiased Learning from Naturally Imbalanced Pseudo-Labels

About

Pseudo-labels are confident predictions made on unlabeled target data by a classifier trained on labeled source data. They are widely used for adapting a model to unlabeled data, e.g., in a semi-supervised learning setting. Our key insight is that pseudo-labels are naturally imbalanced due to intrinsic data similarity, even when a model is trained on balanced source data and evaluated on balanced target data. If we address this previously unknown imbalanced classification problem arising from pseudo-labels instead of ground-truth training labels, we could remove model biases towards false majorities created by pseudo-labels. We propose a novel and effective debiased learning method with pseudo-labels, based on counterfactual reasoning and adaptive margins: The former removes the classifier response bias, whereas the latter adjusts the margin of each class according to the imbalance of pseudo-labels. Validated by extensive experimentation, our simple debiased learning delivers significant accuracy gains over the state-of-the-art on ImageNet-1K: 26% for semi-supervised learning with 0.2% annotations and 9% for zero-shot learning. Our code is available at: https://github.com/frank-xwang/debiased-pseudo-labeling.
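The two components named in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that). It assumes one common way to realize the idea: a running mean of the model's predicted class distribution on unlabeled data (`p_hat`) stands in for the pseudo-label imbalance, its log is subtracted from the logits before pseudo-labeling, and the same quantity shifts the logits inside the loss as a per-class margin. The function names, the strength `lam`, the `momentum`, and the confidence `threshold` are all hypothetical choices made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def update_running_mean(p_hat, batch_probs, momentum=0.99):
    """Exponential moving average of the mean predicted class distribution.

    Assumption: p_hat approximates how often each class is predicted on
    unlabeled data, i.e. the imbalance of the pseudo-labels.
    """
    n, k = len(batch_probs), len(p_hat)
    batch_mean = [sum(p[j] for p in batch_probs) / n for j in range(k)]
    return [momentum * p_hat[j] + (1 - momentum) * batch_mean[j] for j in range(k)]

def debias_logits(logits, p_hat, lam=0.5):
    """Subtract lam * log(p_hat): over-predicted classes are pushed down."""
    return [z - lam * math.log(p) for z, p in zip(logits, p_hat)]

def pseudo_label(logits, p_hat, lam=0.5, threshold=0.95):
    """Return (label, confidence); label is None below the threshold."""
    probs = softmax(debias_logits(logits, p_hat, lam))
    conf = max(probs)
    label = probs.index(conf)
    return (label if conf >= threshold else None), conf

def adaptive_margin_loss(logits, target, p_hat, lam=0.5):
    """Cross-entropy on logits shifted by lam * log(p_hat).

    Frequently predicted classes get a larger shifted logit, so a rarely
    predicted target class must win by a larger margin to reach low loss.
    """
    shifted = [z + lam * math.log(p) for z, p in zip(logits, p_hat)]
    m = max(shifted)
    log_sum = m + math.log(sum(math.exp(s - m) for s in shifted))
    return log_sum - shifted[target]
```

With a uniform `p_hat`, both functions reduce to ordinary softmax pseudo-labeling and ordinary cross-entropy. Once `p_hat` becomes skewed, a sample that narrowly favors the over-predicted class can be reassigned to the under-predicted one.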

Xudong Wang, Zhirong Wu, Long Lian, Stella X. Yu • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | ImageNet 1k (test) | Top-1 Accuracy | 68.3 | 798 |
| Image Classification | CIFAR-10-LT gamma=100 (test) | -- | -- | 35 |
| Medical Image Segmentation | AMOS 5% labeled | Mean Dice | 41.97 | 29 |
| Image Classification | ImageNet-1K 1.0 (1% labels) | Top-1 Acc | 70.9 | 28 |
| Semi-supervised medical image segmentation | Synapse (20% labeled) | Average Dice Score | 36.27 | 27 |
| Image Classification | Five Datasets 8-shot | Accuracy | 67.6 | 18 |
| Image Classification | Five Datasets 16-shot | Accuracy | 73.2 | 18 |
| Image Classification | Five Datasets 4-shot | Accuracy | 0.603 | 18 |
| Multi-organ Segmentation | Synapse 20% labeled data (test) | Avg. Dice | 36.27 | 16 |
| Image Classification | ImageNet-1K 0.2% labels 1.0 | Top-1 Acc | 69.6 | 7 |
Showing 10 of 17 rows
