Divide and Contrast: Self-supervised Learning from Uncurated Data
About
Self-supervised learning holds promise for leveraging large amounts of unlabeled data; however, much of its progress has thus far been limited to highly curated pre-training data such as ImageNet. We explore the effects of contrastive learning on larger, less-curated image datasets such as YFCC, and find that there is indeed a large difference in the resulting representation quality. We hypothesize that this curation gap is due to a shift in the distribution of image classes -- which is more diverse and heavy-tailed -- resulting in less relevant negative samples to learn from. We test this hypothesis with a new approach, Divide and Contrast (DnC), which alternates between contrastive learning and clustering-based hard negative mining. When pretrained on less-curated datasets, DnC greatly improves the performance of self-supervised learning on downstream tasks, while remaining competitive with the current state of the art on curated datasets.
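The core idea behind the alternation can be sketched in a few lines: cluster the current embeddings, then restrict each anchor's negatives to samples from its own cluster, so the negatives are semantically closer and therefore harder. This is a minimal illustrative sketch, not the paper's implementation; the `kmeans` and `infonce_within_cluster` helpers and all hyperparameters here are assumptions for illustration only.

```python
import numpy as np

def kmeans(x, k, iters=10, seed=0):
    """Simple k-means to partition embeddings into k clusters (illustrative)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = np.argmin(((x[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        # Recompute each center as the mean of its assigned points.
        for j in range(k):
            if (labels == j).any():
                centers[j] = x[labels == j].mean(axis=0)
    return labels

def infonce_within_cluster(anchors, positives, labels, tau=0.1):
    """InfoNCE-style loss where each anchor's negatives are restricted to
    its own cluster, yielding harder (more similar) negative samples."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    losses = []
    for i in range(len(a)):
        mask = labels == labels[i]            # candidates from the same cluster
        sims = a[i] @ p[mask].T / tau         # similarities to in-cluster samples
        pos = a[i] @ p[i] / tau               # similarity to own positive view
        # -log softmax of the positive over the in-cluster candidates
        losses.append(-pos + np.log(np.exp(sims - sims.max()).sum()) + sims.max())
    return float(np.mean(losses))
```

In a full training loop this clustering step would be re-run periodically on fresh embeddings, so the partition tracks the evolving representation.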
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic Segmentation | ADE20K (val) | mIoU 39.2 | 2731 |
| Image Classification | ImageNet-1k (val) | Top-1 Accuracy 75.8 | 1453 |
| Video Object Segmentation | DAVIS 2017 (val) | J mean 63.1 | 1130 |
| Semantic Segmentation | ADE20K | mIoU 39.2 | 936 |
| Object Detection | COCO (val) | mAP 43.9 | 613 |
| Action Recognition | UCF101 (test) | -- | 307 |
| Image Classification | Stanford Cars (test) | Accuracy 75.3 | 306 |
| Instance Segmentation | COCO | AP (mask) 37.2 | 279 |
| Image Classification | CIFAR10 (test) | Accuracy 91.7 | 266 |
| Image Classification | ImageNet (test) | Top-1 Accuracy 70.7 | 235 |