DASO: Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning
About
Traditional semi-supervised learning (SSL) methods fall short of real-world applicability because their pseudo-labels are severely biased by (1) class imbalance and (2) class distribution mismatch between labeled and unlabeled data. This paper addresses this relatively under-explored problem. First, we propose a general pseudo-labeling framework that class-adaptively blends the semantic pseudo-label from a similarity-based classifier with the linear pseudo-label from the linear classifier, motivated by the observation that the two types of pseudo-labels have complementary properties in terms of bias. We further introduce a novel semantic alignment loss that establishes a balanced feature representation and thereby reduces biased predictions from the classifier. We term the whole framework Distribution-Aware Semantics-Oriented (DASO) Pseudo-label. We conduct extensive experiments on a wide range of imbalanced benchmarks (CIFAR10/100-LT, STL10-LT, and the large-scale long-tailed Semi-Aves with open-set classes) and demonstrate that the proposed DASO framework reliably improves SSL learners with unlabeled data, especially when both (1) class imbalance and (2) distribution mismatch dominate.
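The class-adaptive blending described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the cosine-similarity prototypes, the temperature value, the rule that derives blending weights from the current pseudo-label distribution, and all function names below are assumptions chosen for illustration only.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def semantic_pseudo_label(feature, prototypes, temperature=0.05):
    """Similarity-based prediction: softmax over similarities to class prototypes.

    Assumption: one prototype vector per class and a hypothetical temperature.
    """
    return softmax([cosine(feature, p) for p in prototypes], temperature)


def blend_weights(pseudo_label_dist):
    """Hypothetical weighting rule: classes that currently receive many
    pseudo-labels (likely over-predicted) lean more on the semantic label."""
    peak = max(pseudo_label_dist)
    return [m / peak for m in pseudo_label_dist]


def blend_pseudo_label(p_linear, p_semantic, weights):
    """Mix the two pseudo-labels with the weight of the class that the
    linear classifier predicts (class-adaptive blending)."""
    k = max(range(len(p_linear)), key=p_linear.__getitem__)
    v = weights[k]
    return [(1.0 - v) * pl + v * ps for pl, ps in zip(p_linear, p_semantic)]


# Toy example with two classes and 2-D features (all values made up).
prototypes = [[1.0, 0.0], [0.0, 1.0]]
p_linear = [0.7, 0.3]                      # linear classifier favors class 0
p_semantic = semantic_pseudo_label([0.6, 0.8], prototypes)
weights = blend_weights([0.9, 0.1])        # class 0 dominates current pseudo-labels
blended = blend_pseudo_label(p_linear, p_semantic, weights)
```

In this toy setting the linear classifier predicts the class that already dominates the pseudo-label distribution, so the blend leans toward the semantic prediction. A sample predicted as a rarely pseudo-labeled class would instead keep most of its linear pseudo-label.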
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | CIFAR-10 long-tailed (test) | Top-1 Acc | 77.9 | 201 |
| Image Classification | CIFAR-10-LT (test) | -- | -- | 185 |
| Image Classification | CIFAR100 long-tailed (test) | Accuracy | 60.6 | 155 |
| Classification | CIFAR100-LT (test) | Accuracy | 61.8 | 136 |
| Image Classification | CIFAR10 long-tailed (test) | Accuracy | 83.4 | 68 |
| Image Classification | CIFAR10 LT (test) | Accuracy | 83.4 | 68 |
| Image Classification | CIFAR100 LT | Balanced Accuracy | 60.6 | 57 |
| Image Classification | CIFAR10-LT | Accuracy | 83.4 | 48 |
| Semi-supervised Image Classification | CIFAR100-LT (test) | Accuracy | 0.606 | 48 |
| Image Classification | STL10-LT (gamma_l = 10) (test) | Accuracy | 79 | 42 |