
Robust Contrastive Learning against Noisy Views

About

Contrastive learning relies on the assumption that positive pairs contain related views, e.g., patches of an image or co-occurring multimodal signals of a video, that share certain underlying information about an instance. But what if this assumption is violated? The literature suggests that contrastive learning produces suboptimal representations in the presence of noisy views, e.g., false positive pairs with no apparent shared information. In this work, we propose a new contrastive loss function that is robust against noisy views. We provide rigorous theoretical justifications by showing connections to robust symmetric losses for noisy binary classification and by establishing a new contrastive bound for mutual information maximization based on the Wasserstein distance measure. The proposed loss is completely modality-agnostic and a simple drop-in replacement for the InfoNCE loss, which makes it easy to apply to existing contrastive frameworks. We show that our approach provides consistent improvements over the state of the art on image, video, and graph contrastive learning benchmarks that exhibit a variety of real-world noise patterns.
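For context, the abstract positions the proposed loss as a drop-in replacement for InfoNCE. Below is a minimal NumPy sketch of the standard InfoNCE loss being replaced, for a single anchor with one positive and a set of negatives; the function names, shapes, and temperature value are our own illustration, not taken from the paper:

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between vector a and each row of matrix b."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return b @ a

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Standard InfoNCE loss for one anchor:
    -log( exp(s_pos/t) / sum over positive and negatives of exp(s/t) )."""
    pos = cosine_sim(anchor, positive[None, :])[0] / temperature
    neg = cosine_sim(anchor, negatives) / temperature
    logits = np.concatenate([[pos], neg])
    # -log softmax of the positive score
    return -pos + np.log(np.sum(np.exp(logits)))
```

The loss is small when the anchor is close to its positive and far from the negatives, and grows when a "positive" view is in fact unrelated — the noisy-view failure mode the paper addresses.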

Ching-Yao Chuang, R Devon Hjelm, Xin Wang, Vibhav Vineet, Neel Joshi, Antonio Torralba, Stefanie Jegelka, Yale Song • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | ImageNet-1k (val) | Top-1 Accuracy: 74.2 | 1453 |
| Text-to-Image Retrieval | Flickr30K | R@1: 54.9 | 460 |
| Image-to-Text Retrieval | Flickr30K | R@1: 72.1 | 379 |
| Text-to-Image Retrieval | MS-COCO | R@5: 89.2 | 79 |
| Image-to-Text Retrieval | MS-COCO | R@5: 95.8 | 65 |
| Action Recognition | UCF101 (test) | Accuracy: 88.8 | 50 |
| Text-to-Image Retrieval | CC152K | R@1: 37.6 | 48 |
| Image-to-Text Retrieval | CC152K | R@1: 35.9 | 48 |
| Graph Classification | PROTEINS (TUDataset) | Accuracy: 74.7 | 44 |
| Graph Classification | NCI1 (TUDataset) | Accuracy: 78.6 | 44 |

Showing 10 of 13 rows

Other info

Code
