Newer is not always better: Rethinking transferability metrics, their peculiarities, stability and performance

About

Fine-tuning large pre-trained image and language models on small customized datasets has become increasingly popular for improving prediction and making efficient use of limited resources. Fine-tuning requires identifying the best models to transfer-learn from, and quantifying transferability avoids expensive re-training on all candidate model/task pairs. In this paper, we show that statistical problems with covariance estimation drive the poor performance of H-score, a common baseline for newer metrics, and we propose a shrinkage-based estimator. This results in up to 80% absolute gain in H-score correlation performance, making it competitive with the state-of-the-art LogME measure. Our shrinkage-based H-score is $3\times$ to $10\times$ faster to compute than LogME. Additionally, we look into the less common setting of target (as opposed to source) task selection. We demonstrate previously overlooked problems in such settings with different numbers of labels, class-imbalance ratios, etc. for some recent metrics (e.g., NCE and LEEP), which resulted in them being misrepresented as leading measures. We propose a correction and recommend measuring correlation performance against relative accuracy in such settings. We support our findings with ~164,000 experiments (fine-tuning trials) on both vision models and graph neural networks.

Shibal Ibrahim, Natalia Ponomareva, Rahul Mazumder • 2021
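
To make the covariance-estimation issue in the abstract concrete, here is a minimal sketch of an H-score computed with a shrunk covariance matrix. It assumes the commonly used H-score form, tr(cov(f)^-1 cov(E[f|y])), and shrinks the feature covariance toward a scaled identity with a fixed weight. The function name `h_score`, the parameter `alpha`, and the choice of a fixed shrinkage weight are illustrative assumptions, not the paper's exact estimator, which may select the shrinkage differently and rescale other terms.

```python
import numpy as np


def h_score(features, labels, alpha=0.0):
    """Sketch of an H-score with optional covariance shrinkage.

    alpha is a hypothetical fixed shrinkage weight in [0, 1];
    alpha = 0 gives the plain (unregularized) H-score.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    n, d = features.shape

    # Center the features and form the (biased) sample covariance.
    centered = features - features.mean(axis=0)
    cov_f = centered.T @ centered / n

    # Shrink toward a scaled identity with the same average variance.
    target = (np.trace(cov_f) / d) * np.eye(d)
    cov_shrunk = (1.0 - alpha) * cov_f + alpha * target

    # Replace every sample by its (centered) class mean, then take the
    # covariance of those class-conditional means.
    class_means = np.zeros_like(centered)
    for c in np.unique(labels):
        mask = labels == c
        class_means[mask] = centered[mask].mean(axis=0)
    cov_z = class_means.T @ class_means / n

    # The pseudo-inverse keeps the alpha = 0 case defined when cov_f is singular.
    return float(np.trace(np.linalg.pinv(cov_shrunk) @ cov_z))


# Toy usage: the first feature separates the two classes, the second is constant.
X = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
y = np.array([0, 0, 1, 1])
print(h_score(X, y), h_score(X, y, alpha=0.5))
```

With many feature dimensions and few samples the unshrunk covariance is poorly conditioned, which is the situation where a shrinkage weight above zero is expected to matter; the toy data above only checks that the computation runs.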

Related benchmarks

Task                             Dataset                    Result                                    Rank
Semantic segmentation            Nuclei LM Target           Ktau 0.35                                 21
Semantic segmentation            Mito EM (target datasets)  Ktau 0.24                                 21
Transferability Estimation       OfficeHome v1 (test)       Mean Spearman Correlation 19.61           16
Transferability Prediction       OfficeHome                 MCI 26.95                                 16
Transferability Prediction       PACS                       MCI (%pt.) -48.89                         16
Sentiment Analysis               EuroEval 34                MCI (%pt.) 10.45                          16
Transferability Estimation       DomainNet                  MCI (%pt.) 17.11                          16
Transferability Estimation       ImageNet-C Severity 1 1.0  Mean Correlation Improvement (MCI) 12.44  16
Transferability Estimation       DomainNet v1 (test)        Mean Spearman Correlation -17.5           16
Transfer Performance Prediction  EuroEval (test)            Mean Spearman Correlation 5.1             16
Showing 10 of 15 rows
