
Robust fine-tuning of zero-shot models

About

Large pre-trained models such as CLIP or ALIGN offer consistent accuracy across a range of data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific dataset). Although existing fine-tuning methods substantially improve accuracy on a given target distribution, they often reduce robustness to distribution shifts. We address this tension by introducing a simple and effective method for improving robustness while fine-tuning: ensembling the weights of the zero-shot and fine-tuned models (WiSE-FT). Compared to standard fine-tuning, WiSE-FT provides large accuracy improvements under distribution shift, while preserving high accuracy on the target distribution. On ImageNet and five derived distribution shifts, WiSE-FT improves accuracy under distribution shift by 4 to 6 percentage points (pp) over prior work while increasing ImageNet accuracy by 1.6 pp. WiSE-FT achieves similarly large robustness gains (2 to 23 pp) on a diverse set of six further distribution shifts, and accuracy gains of 0.8 to 3.3 pp compared to standard fine-tuning on seven commonly used transfer learning datasets. These improvements come at no additional computational cost during fine-tuning or inference.

Mitchell Wortsman, Gabriel Ilharco, Jong Wook Kim, Mike Li, Simon Kornblith, Rebecca Roelofs, Raphael Gontijo-Lopes, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, Ludwig Schmidt • 2021
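The weight ensembling described in the abstract can be illustrated with a short sketch. It assumes the ensemble is a linear interpolation between the two sets of weights, controlled by a mixing coefficient `alpha`; the abstract itself does not spell out the formula. The function name `interpolate_weights`, the `Weights` type, and the toy parameter values are all hypothetical, and plain Python lists stand in for real model tensors.

```python
from typing import Dict, List

# Illustrative stand-in for model parameters: name -> flattened values.
Weights = Dict[str, List[float]]


def interpolate_weights(zero_shot: Weights, fine_tuned: Weights, alpha: float) -> Weights:
    """Return (1 - alpha) * zero_shot + alpha * fine_tuned, parameter by parameter.

    Assumption: linear interpolation with a single coefficient in [0, 1].
    alpha = 0 recovers the zero-shot weights, alpha = 1 the fine-tuned ones.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if zero_shot.keys() != fine_tuned.keys():
        raise ValueError("both models must share the same parameter names")

    merged: Weights = {}
    for name, w0 in zero_shot.items():
        w1 = fine_tuned[name]
        if len(w0) != len(w1):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Element-wise convex combination of the two parameter vectors.
        merged[name] = [(1.0 - alpha) * a + alpha * b for a, b in zip(w0, w1)]
    return merged


# Toy values, not taken from any real model.
zero_shot = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
fine_tuned = {"layer.weight": [4.0, 6.0], "layer.bias": [3.0]}
merged = interpolate_weights(zero_shot, fine_tuned, alpha=0.5)
```

Because the result is a single set of weights with the same shape as either input, running the merged model costs the same as running one model, which is consistent with the abstract's claim of no additional computational cost at inference.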

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | ImageNet-1K 1.0 (val) | Top-1 Accuracy 87.1 | 1866 |
| Object Hallucination Evaluation | POPE | -- | 935 |
| Image Classification | ImageNet 1k (test) | Top-1 Accuracy 85.33 | 798 |
| Image Classification | CIFAR10 (test) | Accuracy 99.5 | 585 |
| Image Classification | ImageNet A | Top-1 Acc 81 | 553 |
| Image Classification | EuroSAT | Accuracy 73.6 | 497 |
| Image Classification | ImageNet V2 | Top-1 Acc 79.5 | 487 |
| Image Classification | Flowers102 | Accuracy 6.6 | 478 |
| Image Classification | Stanford Cars | Accuracy 63.3 | 477 |
| Image Classification | ImageNet-R | Top-1 Acc 90.3 | 474 |
Showing 10 of 128 rows
