Generalized Logit Adjustment: Calibrating Fine-tuned Models by Removing Label Bias in Foundation Models

About

Foundation models like CLIP allow zero-shot transfer on various tasks without additional training data. Yet, the zero-shot performance is less competitive than that of a fully supervised model. Thus, to enhance the performance, fine-tuning and ensembling are also commonly adopted to better fit the downstream tasks. However, we argue that such prior work has overlooked the inherent biases in foundation models. Due to the highly imbalanced Web-scale training set, these foundation models are inevitably skewed toward frequent semantics, and thus the subsequent fine-tuning or ensembling is still biased. In this study, we systematically examine the biases in foundation models and demonstrate the efficacy of our proposed Generalized Logit Adjustment (GLA) method. Note that bias estimation in foundation models is challenging, as most pre-training data cannot be explicitly accessed as in traditional long-tailed classification tasks. To this end, GLA has an optimization-based bias estimation approach for debiasing foundation models. As our work resolves a fundamental flaw in the pre-training, the proposed GLA demonstrates significant improvements across a diverse range of tasks: it achieves 1.5 pp accuracy gains on ImageNet, a large average improvement (1.4-4.6 pp) on 11 few-shot datasets, and 2.4 pp gains on long-tailed classification. Code is available at https://github.com/BeierZhu/GLA.
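
The debiasing idea described in the abstract can be illustrated with a small sketch. It assumes that zero-shot and fine-tuned logits are already computed as NumPy arrays and that a labeled downstream set is available. The function names (`estimate_log_prior`, `gla_logits`), the plain gradient-descent fit, and the combination rule (fine-tuned logits plus zero-shot logits minus an estimated per-class log prior) are illustrative assumptions based on the abstract's description, not code taken from the GLA repository.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def estimate_log_prior(zs_logits, labels, steps=500, lr=0.5):
    """Hypothetical helper: fit a per-class offset b such that
    softmax(zs_logits - b) minimizes cross-entropy on labeled
    downstream data. b stands in for the unknown log label prior."""
    n, k = zs_logits.shape
    b = np.zeros(k)
    onehot = np.eye(k)[labels]
    for _ in range(steps):
        p = softmax(zs_logits - b)
        # Gradient of mean cross-entropy w.r.t. b is mean(onehot - p).
        grad = (onehot - p).mean(axis=0)
        b -= lr * grad
    # Offsets are only identifiable up to a constant; center them.
    return b - b.mean()

def gla_logits(ft_logits, zs_logits, log_prior):
    """Assumed combination rule: ensemble the fine-tuned and zero-shot
    logits, then subtract the estimated label bias."""
    return ft_logits + zs_logits - log_prior

# Synthetic demo (made-up data): zero-shot logits carry a class bias.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 200)
clean = 2.0 * np.eye(3)[labels] + rng.normal(size=(600, 3))
bias = np.array([1.5, 0.0, -1.5])          # favors class 0, hurts class 2
zs = clean + bias
ft = 2.0 * np.eye(3)[labels] + rng.normal(size=(600, 3))

b_hat = estimate_log_prior(zs, labels)
acc_biased = (zs.argmax(axis=1) == labels).mean()
acc_debiased = ((zs - b_hat).argmax(axis=1) == labels).mean()
acc_gla = (gla_logits(ft, zs, b_hat).argmax(axis=1) == labels).mean()
print(acc_biased, acc_debiased, acc_gla)
```

In this toy setting the fitted offsets should track the injected bias, so subtracting them removes the skew toward the over-represented class. In practice the offsets would be estimated on held-out downstream data rather than on the evaluation set.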

Beier Zhu, Kaihua Tang, Qianru Sun, Hanwang Zhang • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Fine-grained Image Classification | Stanford Cars (test) | Accuracy 91.1 | 348 |
| Image Classification | ImageNet V2 (test) | Top-1 Accuracy 57.1 | 181 |
| Image Classification | ImageNet-A (test) | -- | 154 |
| Image Classification | CIFAR100 (test) | Test Accuracy 91.9 | 147 |
| Image Classification | ImageNet-Sketch (test) | Top-1 Acc 0.364 | 132 |
| Image Classification | ImageNet-R (test) | Accuracy 62.1 | 105 |
| Long-tailed recognition | Places-LT (test) | Accuracy (Overall) 47.2 | 71 |
| Scene recognition | SUN397 | Accuracy 83.4 | 34 |
| Scene recognition | SUN 397 (test) | Top-1 Accuracy 83.4 | 29 |
| Object Classification | ImageNet (test) | Top-1 Acc 82.8 | 15 |
Showing 10 of 12 rows
