# Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition

## About
Long-tailed semi-supervised learning (LTSSL) poses a significant challenge: training models with limited labeled data that exhibits a long-tailed label distribution. Current state-of-the-art LTSSL approaches rely heavily on high-quality pseudo-labels for large-scale unlabeled data. However, these methods often neglect the impact of the representations learned by the neural network and struggle with real-world unlabeled data, which typically follows a distribution different from that of the labeled data. This paper introduces a novel probabilistic framework that unifies several recent proposals in long-tail learning. Our framework derives the class-balanced contrastive loss through Gaussian kernel density estimation. We introduce a continuous contrastive learning method, CCL, which extends our framework to unlabeled data using reliable and smoothed pseudo-labels. By progressively estimating the underlying label distribution and optimizing its alignment with model predictions, we tackle the diverse distributions of unlabeled data encountered in real-world scenarios. Extensive experiments across multiple datasets with varying unlabeled-data distributions demonstrate that CCL consistently outperforms prior state-of-the-art methods, achieving an improvement of over 4% on the ImageNet-127 dataset. Our source code is available at https://github.com/zhouzihao11/CCL
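
The central idea in the abstract is a contrastive objective whose positive weights come from a Gaussian kernel over (smoothed) pseudo-label distributions rather than from hard positive/negative assignments. The snippet below is a minimal PyTorch sketch of that idea only; it is not the repository's implementation, and the function names and hyperparameters (`bandwidth`, `temperature`) are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of a contrastive loss whose positive
# weights are given by a Gaussian kernel over label / pseudo-label vectors.
import torch
import torch.nn.functional as F


def gaussian_kernel_weights(labels: torch.Tensor, bandwidth: float = 1.0) -> torch.Tensor:
    """Pairwise Gaussian kernel over (N, C) label or smoothed pseudo-label vectors."""
    dist_sq = torch.cdist(labels, labels, p=2).pow(2)       # squared L2 distances, (N, N)
    return torch.exp(-dist_sq / (2.0 * bandwidth ** 2))     # kernel weights in [0, 1]


def continuous_contrastive_loss(features: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.1,
                                bandwidth: float = 1.0) -> torch.Tensor:
    """Contrastive loss with kernel-weighted positives instead of hard pairs."""
    z = F.normalize(features, dim=1)                         # (N, D) unit-norm embeddings
    logits = z @ z.t() / temperature                         # (N, N) scaled similarities
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    logits = logits.masked_fill(eye, float('-inf'))          # exclude self-pairs

    weights = gaussian_kernel_weights(labels, bandwidth).masked_fill(eye, 0.0)
    weights = weights / weights.sum(dim=1, keepdim=True).clamp_min(1e-12)

    log_prob = F.log_softmax(logits, dim=1)                  # contrastive log-probabilities
    return -(weights * log_prob).sum(dim=1).mean()


# Hypothetical usage on one batch:
# feats  = encoder(images)                    # (N, D) embeddings
# pseudo = F.softmax(classifier(feats), dim=1)  # (N, C) smoothed pseudo-labels
# loss   = continuous_contrastive_loss(feats, pseudo)
```

The kernel weighting lets each unlabeled sample contribute to the loss in proportion to how close its pseudo-label distribution is to its neighbors', which is one plausible reading of the "continuous" contrastive learning described above.
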
## Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-10-LT (test) | -- | 185 |
| Classification | CIFAR100-LT (test) | Accuracy: 67.9% | 136 |
| Image Classification | CIFAR10 LT (test) | Accuracy: 86.2% | 68 |
| Image Classification | STL10-LT (test) | Accuracy: 84.8% | 36 |
| Image Classification | Small-ImageNet-127 size 32x32 and 64x64 (test) | Accuracy: 67.8% | 18 |