Tip-Adapter: Training-free Adaption of CLIP for Few-shot Classification

About

Contrastive Vision-Language Pre-training, known as CLIP, has provided a new paradigm for learning visual representations using large-scale image-text pairs. It shows impressive performance on downstream tasks by zero-shot knowledge transfer. To further enhance CLIP's adaption capability, existing methods fine-tune additional learnable modules, which significantly improves few-shot performance but requires extra training time and computational resources. In this paper, we propose a training-free adaption method for CLIP to conduct few-shot classification, termed Tip-Adapter, which not only inherits the training-free advantage of zero-shot CLIP but also performs comparably to approaches that require training. Tip-Adapter constructs the adapter via a key-value cache model from the few-shot training set, and updates the prior knowledge encoded in CLIP by feature retrieval. On top of that, the performance of Tip-Adapter can be further boosted to state-of-the-art on ImageNet by fine-tuning the cache model for 10$\times$ fewer epochs than existing methods, which is both effective and efficient. We conduct extensive few-shot classification experiments on 11 datasets to demonstrate the superiority of our proposed methods. Code is released at https://github.com/gaopengcuhk/Tip-Adapter.

Renrui Zhang, Zhang Wei, Rongyao Fang, Peng Gao, Kunchang Li, Jifeng Dai, Yu Qiao, Hongsheng Li • 2022
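
The key-value cache and feature retrieval described in the abstract can be illustrated with a minimal sketch. It assumes the cache keys are L2-normalized few-shot image features, the values are one-hot labels, and retrieval uses an exponential affinity on cosine similarity. The function names, the `alpha` and `beta` hyperparameters, and the toy 2-D features are illustrative assumptions, not taken from the released code. Plain Python lists stand in for real CLIP embeddings, and the scaling CLIP applies to its logits is left out for simplicity.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products act as cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def build_cache(features, labels, num_classes):
    """Build the cache without training: keys are features, values are one-hot labels."""
    keys = [l2_normalize(f) for f in features]
    values = [[1.0 if c == y else 0.0 for c in range(num_classes)] for y in labels]
    return keys, values


def tip_adapter_logits(query, keys, values, class_weights, alpha=1.0, beta=5.5):
    """Combine a zero-shot prior with labels retrieved from the cache.

    alpha (weight of the cache term) and beta (sharpness of the affinity)
    are hypothetical hyperparameters chosen for this sketch.
    """
    q = l2_normalize(query)
    # Zero-shot prior: similarity between the query and each class text embedding.
    prior = [dot(q, l2_normalize(w)) for w in class_weights]
    # Retrieval: affinity in (0, 1] that grows as the query approaches a cached key.
    affinity = [math.exp(-beta * (1.0 - dot(q, k))) for k in keys]
    # Affinity-weighted sum of the one-hot labels, one entry per class.
    cache = [
        sum(a * v[c] for a, v in zip(affinity, values))
        for c in range(len(class_weights))
    ]
    return [p + alpha * c for p, c in zip(prior, cache)]


# Toy example: two classes, one cached shot each, and an uninformative text prior.
keys, values = build_cache([[1.0, 0.0], [0.0, 1.0]], [0, 1], num_classes=2)
logits = tip_adapter_logits([0.9, 0.1], keys, values, [[1.0, 1.0], [1.0, 1.0]])
predicted = max(range(len(logits)), key=logits.__getitem__)
```

In this toy setup the text prior cannot separate the two classes, so the prediction is decided entirely by the cached shot nearest to the query. Setting `alpha=0` recovers the zero-shot prior alone.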

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Image Classification | ImageNet 1k (test) | Top-1 Accuracy | 73.3 | 848 |
| Image Classification | Stanford Cars | Accuracy | 82.3 | 635 |
| Image Classification | ImageNet V2 | Top-1 Acc | 57.11 | 611 |
| Image Classification | EuroSAT | Accuracy | 70.5 | 569 |
| Image Classification | Flowers102 | Accuracy | 89.9 | 558 |
| Image Classification | DTD | Accuracy | 66.94 | 542 |
| Image Classification | Food101 | Accuracy | 86.8 | 457 |
| Image Classification | UCF101 | Top-1 Acc | 83.9 | 455 |
| Image Classification | SUN397 | Accuracy | 76 | 441 |
| Image Classification | ImageNet-Sketch | Top-1 Accuracy | 36 | 407 |
Showing 10 of 94 rows
