Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

Let Go of Your Labels with Unsupervised Transfer

About

Foundation vision-language models have enabled remarkable zero-shot transferability of the pre-trained representations to a wide range of downstream tasks. However, to solve a new task, zero-shot transfer still necessitates human guidance to define visual categories that appear in the data. Here, we show that fully unsupervised transfer emerges when searching for the labeling of a dataset that induces maximal margin classifiers in representation spaces of different foundation models. We present TURTLE, a fully unsupervised method that effectively employs this guiding principle to uncover the underlying labeling of a downstream dataset without any supervision and task-specific representation learning. We evaluate TURTLE on a diverse benchmark suite of 26 datasets and show that it achieves new state-of-the-art unsupervised performance. Furthermore, TURTLE, although being fully unsupervised, outperforms zero-shot transfer baselines on a wide range of datasets. In particular, TURTLE matches the average performance of CLIP zero-shot on 26 datasets by employing the same representation space, spanning a wide range of architectures and model sizes. By guiding the search for the underlying labeling using the representation spaces of two foundation models, TURTLE surpasses zero-shot transfer and unsupervised prompt tuning baselines, demonstrating the surprising power and effectiveness of unsupervised transfer.

Artyom Gadetsky, Yulun Jiang, Maria Brbic• 2024

Related benchmarks

TaskDatasetResultRank
Image ClassificationFood-101
Accuracy92.2
494
Image ClusteringCIFAR-10
NMI0.929
243
Image ClassificationImageNet
Accuracy72.9
184
ClusteringMNIST (test)--
122
ClusteringCIFAR-100 (test)
ACC89.1
110
ClusteringFashion MNIST
NMI72.3
95
ClusteringCIFAR100
Clustering Accuracy80.6
11
ClusteringMNIST
ACC0.573
5
ClusteringFood101 (test)
Accuracy92.8
4
ClusteringBirdsnap (test)
Accuracy67.8
4
Showing 10 of 11 rows

Other info

Code

Follow for update