CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning
About
Semi-supervised learning on class-imbalanced data, although a realistic problem, has been understudied. While existing semi-supervised learning (SSL) methods are known to perform poorly on minority classes, we find that they still generate high-precision pseudo-labels on minority classes. Exploiting this property, we propose Class-Rebalancing Self-Training (CReST), a simple yet effective framework to improve existing SSL methods on class-imbalanced data. CReST iteratively retrains a baseline SSL model with a labeled set expanded by pseudo-labeled samples drawn from the unlabeled set, where samples pseudo-labeled as minority classes are selected more frequently according to an estimated class distribution. We also propose a progressive distribution alignment that adaptively adjusts the rebalancing strength, dubbed CReST+. We show that CReST and CReST+ improve state-of-the-art SSL algorithms on various class-imbalanced datasets and consistently outperform other popular rebalancing methods. Code has been made available at https://github.com/google-research/crest.
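The core of CReST is class-rebalanced selection of pseudo-labels: rarer classes keep a larger fraction of their pseudo-labeled samples when the labeled set is expanded. Below is a minimal NumPy sketch of that idea. The function names (`crest_sampling_rates`, `select_pseudo_labels`) and the exact selection-by-confidence strategy are illustrative assumptions, not the official implementation; CReST sets the rate for the k-th most frequent class from the k-th *least* frequent class count, raised to a power that CReST+ anneals to control rebalancing strength.

```python
import numpy as np

def crest_sampling_rates(class_counts, alpha=1.0):
    """Per-class selection rates in the spirit of CReST: the k-th most
    frequent class gets rate (N_{reversed(k)} / N_max) ** alpha, so the
    most frequent class gets the smallest rate and the rarest gets 1.0.
    `alpha` tunes rebalancing strength (annealed in CReST+); this helper
    is an illustrative sketch, not the paper's exact code."""
    counts = np.asarray(class_counts, dtype=float)
    order = np.argsort(-counts)              # classes, most- to least-frequent
    sorted_counts = counts[order]
    # pair the k-th most frequent class with the k-th least frequent count
    rates_sorted = (sorted_counts[::-1] / sorted_counts[0]) ** alpha
    rates = np.empty_like(rates_sorted)
    rates[order] = rates_sorted
    return rates

def select_pseudo_labels(probs, rates):
    """Keep the most confident pseudo-labeled samples of each class k,
    at that class's selection rate mu_k (hypothetical helper)."""
    preds = probs.argmax(axis=1)             # hard pseudo-labels
    conf = probs.max(axis=1)                 # prediction confidence
    keep = []
    for k, mu in enumerate(rates):
        idx = np.where(preds == k)[0]
        n_keep = int(round(mu * len(idx)))
        # retain the n_keep most confident predictions for class k
        keep.extend(idx[np.argsort(-conf[idx])[:n_keep]])
    return np.sort(np.array(keep, dtype=int))
```

With counts `[100, 10]` and `alpha=1.0`, the majority class keeps only 10% of its pseudo-labels while the minority class keeps all of them, which is the rebalancing effect the abstract describes.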
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-10 long-tailed (test) | Top-1 Accuracy: 76.3 | 201 |
| Image Classification | CIFAR-10-LT (test) | -- | 185 |
| Image Classification | CIFAR100 long-tailed (test) | Accuracy: 57.4 | 155 |
| Classification | CIFAR100-LT (test) | Accuracy: 59.2 | 136 |
| Image Classification | CIFAR10 long-tailed (test) | Accuracy: 81.1 | 68 |
| Image Classification | CIFAR10 LT (test) | Accuracy: 81.1 | 68 |
| Image Classification | CIFAR100 LT | Balanced Accuracy: 57.4 | 57 |
| Image Classification | CIFAR-100 Long-Tailed (test) | Balanced Accuracy: 52.9 | 51 |
| Image Classification | CIFAR10-LT | Accuracy: 81.1 | 48 |
| Semi-supervised Image Classification | CIFAR100-LT (test) | Accuracy: 0.574 | 48 |