Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning

About

Contrastive self-supervised learning (CSL) has attracted increasing attention for model pre-training on unlabeled data. The resulting CSL models provide instance-discriminative visual features that are uniformly scattered in the feature space. During deployment, the common practice is to fine-tune CSL models directly with cross-entropy, which, however, may not be the best strategy. Although cross-entropy tends to separate inter-class features, the resulting models still have limited capability for reducing the intra-class feature scattering that exists in CSL models. In this paper, we investigate whether applying contrastive learning to fine-tuning would bring further benefits, and analytically find that optimizing the contrastive loss benefits both discriminative representation learning and model optimization during fine-tuning. Inspired by these findings, we propose Contrast-regularized tuning (Core-tuning), a new approach for fine-tuning CSL models. Instead of simply adding the contrastive loss to the fine-tuning objective, Core-tuning further applies a novel hard pair mining strategy for more effective contrastive fine-tuning, and smooths the decision boundary to better exploit the learned discriminative feature space. Extensive experiments on image classification and semantic segmentation verify the effectiveness of Core-tuning.

Yifan Zhang, Bryan Hooi, Dapeng Hu, Jian Liang, Jiashi Feng • 2021
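
The abstract describes adding a contrastive term to the usual cross-entropy fine-tuning objective. The following is a minimal sketch of that general idea only, not the paper's implementation: it combines softmax cross-entropy with a generic supervised contrastive term that pulls same-class features together. The function names, the `weight` and `temperature` parameters, and the exact form of the contrastive term are assumptions made for illustration. Core-tuning's hard pair mining and decision-boundary smoothing are not specified in the abstract and are deliberately left out.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (log-sum-exp for stability)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_z - logits[label]


def _normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def supervised_contrastive(features, labels, temperature=0.1):
    """Generic supervised contrastive term (an assumed form, not the paper's).

    For each anchor, same-class samples are positives and every other
    sample in the batch is a candidate in the softmax denominator.
    """
    feats = [_normalize(f) for f in features]
    n = len(feats)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # anchor has no same-class partner in this batch
        # Temperature-scaled cosine similarity to every other sample.
        sims = {j: sum(a * b for a, b in zip(feats[i], feats[j])) / temperature
                for j in range(n) if j != i}
        m = max(sims.values())
        log_z = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += sum(log_z - sims[j] for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def regularized_loss(logits, features, labels, weight=1.0, temperature=0.1):
    """Mean cross-entropy plus a weighted contrastive regularizer."""
    ce = sum(cross_entropy(l, y) for l, y in zip(logits, labels)) / len(labels)
    return ce + weight * supervised_contrastive(features, labels, temperature)


# Toy batch: two tight clusters in feature space, one per class.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]
logits = [[2.0, 0.0], [1.5, 0.2], [0.1, 1.8], [0.0, 2.2]]
print(regularized_loss(logits, features, labels))
```

In this sketch the contrastive term is small when same-class features already cluster and grows when they are scattered, which is the intra-class effect the abstract says cross-entropy alone handles poorly. Setting `weight=0` recovers plain cross-entropy fine-tuning.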

Related benchmarks

Task                   Dataset                 Result                  Rank
Semantic segmentation  PASCAL VOC 2012 (val)   Mean IoU 79.62          2040
Image Classification   CIFAR-100               Top-1 Accuracy 84.13    622
Image Classification   DTD                     Accuracy 75.37          487
Image Classification   CIFAR-10                --                      471
Image Classification   ImageNet                Top-1 Accuracy 77.43    429
Image Classification   Aircraft                Accuracy 89.48          302
Image Classification   iNaturalist 2018        Top-1 Accuracy 63.57    287
Image Classification   Oxford-IIIT Pets        Accuracy 92.36          259
Image Classification   PACS (test)             Average Accuracy 88.08  254
Semantic segmentation  Pascal VOC (test)       mIoU 79.62              236
Showing 10 of 25 rows
