MV-MR: multi-views and multi-representations for self-supervised learning and knowledge distillation

About

We present a new method for self-supervised learning and knowledge distillation based on multi-views and multi-representations (MV-MR). MV-MR maximizes the dependence between the learnable embeddings of the augmented and non-augmented views, jointly with the dependence between the learnable embedding of the augmented view and multiple non-learnable representations of the non-augmented view. We show that the proposed method can be used for efficient self-supervised classification and model-agnostic knowledge distillation. Unlike other self-supervised techniques, our approach does not use contrastive learning, clustering, or stop gradients. MV-MR is a generic framework that allows constraints to be imposed on the learnable embeddings by using image multi-representations as regularizers; along this line, knowledge distillation is treated as a particular case of such regularization. MV-MR provides state-of-the-art performance on the STL10 and ImageNet-1K datasets among non-contrastive and clustering-free methods. We show that a lower-complexity ResNet50 model, pretrained with the proposed knowledge distillation from a CLIP ViT model, achieves state-of-the-art performance on STL10 linear evaluation. The code is available at: https://github.com/vkinakh/mv-mr
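To make the objective above concrete, here is a minimal PyTorch sketch. The abstract does not name the dependence measure, so the use of empirical distance correlation here is an assumption, and the helper names distance_correlation and mv_mr_loss, along with the assumption that each non-learnable representation is flattened to a per-sample vector, are illustrative rather than the repository's actual API.

```python
import torch

def distance_correlation(x, y, eps=1e-9):
    """Empirical distance correlation between two batches of vectors.

    x: (n, d1), y: (n, d2). The two inputs may have different dimensions,
    since dependence is measured through pairwise-distance matrices.
    """
    a = torch.cdist(x, x)  # (n, n) pairwise Euclidean distances within x
    b = torch.cdist(y, y)  # (n, n) pairwise Euclidean distances within y
    # Double-center each distance matrix.
    A = a - a.mean(0, keepdim=True) - a.mean(1, keepdim=True) + a.mean()
    B = b - b.mean(0, keepdim=True) - b.mean(1, keepdim=True) + b.mean()
    dcov = (A * B).mean().clamp(min=0).sqrt()    # distance covariance
    dvar_x = (A * A).mean().clamp(min=0).sqrt()  # distance variance of x
    dvar_y = (B * B).mean().clamp(min=0).sqrt()  # distance variance of y
    return dcov / (dvar_x * dvar_y).sqrt().clamp(min=eps)

def mv_mr_loss(z_aug, z_clean, handcrafted_reps):
    """Negative sum of dependencies: minimizing it maximizes dependence.

    z_aug:   learnable embedding of the augmented view,      (n, d)
    z_clean: learnable embedding of the non-augmented view,  (n, d)
    handcrafted_reps: list of non-learnable representations of the
        non-augmented view, each flattened to shape (n, d_k).
    """
    loss = -distance_correlation(z_aug, z_clean)
    for r in handcrafted_reps:
        loss = loss - distance_correlation(z_aug, r)
    return loss
```

In the knowledge-distillation reading sketched here, handcrafted_reps would hold a single entry: the teacher's (e.g. CLIP ViT) embedding of the non-augmented view, treated as one more fixed representation regularizing the student.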

Vitaliy Kinakh, Mariia Drozdova, Slava Voloshynovskiy • 2023

Related benchmarks

Task                  Dataset            Result                Rank
Image Classification  STL-10 (test)      Accuracy: 95.6        357
Image Classification  VOC 2007 (test)    mAP: 87.1             67
Image Classification  ImageNet-1k (val)  Top-1 Accuracy: 74.5  34
Image Classification  CIFAR-20           Mean Accuracy: 73.2   27

Other info

Code: https://github.com/vkinakh/mv-mr
