Scalable Transfer Learning with Expert Models
About
Transfer of pre-trained representations can improve sample efficiency and reduce computational requirements for new tasks. However, the representations used for transfer are usually generic and not tailored to a particular distribution of downstream tasks. We explore the use of expert representations for transfer with a simple yet effective strategy. We train a diverse set of experts by exploiting existing label structures, and use cheap-to-compute performance proxies to select the relevant expert for each target task. This strategy scales the process of transferring to new tasks, since it does not revisit the pre-training data during transfer. Accordingly, it requires little extra compute per target task, and results in a speed-up of 2-3 orders of magnitude compared to competing approaches. Further, we provide an adapter-based architecture able to compress many experts into a single model. We evaluate our approach on two different data sources and demonstrate that it outperforms baselines on over 20 diverse vision tasks in both cases.
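To make the selection step concrete, here is a minimal sketch of proxy-based expert selection, assuming a k-NN accuracy proxy computed on frozen expert embeddings of the target training set (one common cheap proxy). The `experts`/`embed` interface, function names, and the toy data are illustrative assumptions, not the paper's actual API.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier


def knn_proxy_score(embeddings, labels, k=1, folds=5):
    """Cheap accuracy proxy: cross-validated k-NN accuracy on the
    frozen embeddings an expert produces for the target training set."""
    knn = KNeighborsClassifier(n_neighbors=k)
    return cross_val_score(knn, embeddings, labels, cv=folds).mean()


def select_expert(experts, images, labels):
    """Score every expert with the proxy and return the best one.

    `experts` maps a name to a feature-extractor callable (a hypothetical
    interface) that returns an (N, D) embedding matrix for `images`.
    """
    scores = {name: knn_proxy_score(embed(images), labels)
              for name, embed in experts.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Toy demo: three random linear "experts", each embedding
    # 100 inputs of dimension 32 into a 16-dimensional space.
    rng = np.random.default_rng(0)
    images = rng.normal(size=(100, 32))        # stand-in for raw inputs
    labels = rng.integers(0, 5, size=100)
    experts = {
        f"expert_{i}": (lambda x, w=rng.normal(size=(32, 16)): x @ w)
        for i in range(3)
    }
    best, scores = select_expert(experts, images, labels)
    print(best, scores)
```

The key design point this illustrates is that scoring only requires one forward pass per expert over the (small) target training set, so the pre-training data is never revisited and the per-task cost stays low; only the single selected expert is then fine-tuned.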
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | Food-101 | Accuracy | 93.1 | 494 |
| Image Classification | Stanford Cars | Accuracy | 96.1 | 477 |
| Image Classification | CIFAR-10 | Accuracy | 97.9 | 471 |
| Classification | Cars | Accuracy | 96.4 | 314 |
| Image Classification | Aircraft | Accuracy | 94.8 | 302 |
| Image Classification | Pets | -- | -- | 204 |
| Image Classification | FGVC Aircraft | Top-1 Accuracy | 94.8 | 185 |
| Image Classification | VTAB 1k (test) | Accuracy (Natural) | 80.2 | 121 |
| Image Classification | Food | Accuracy | 93.1 | 92 |
| Image Classification | Bird | Accuracy | 84.3 | 29 |