Task Adaptive Parameter Sharing for Multi-Task Learning
About
Adapting pre-trained models with broad capabilities has become standard practice for learning a wide range of downstream tasks. The typical approach of fine-tuning a separate model for each task is performant but incurs a substantial memory cost. To efficiently learn multiple downstream tasks, we introduce Task Adaptive Parameter Sharing (TAPS), a general method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers. This enables multi-task learning while minimizing the resources used and the competition between tasks. TAPS solves a joint optimization problem that determines both which layers to share with the base model and the values of the task-specific weights. Further, a sparsity penalty on the number of active layers encourages weight sharing with the base model. Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters. Moreover, TAPS is agnostic to the model architecture and requires only minor changes to the training scheme. We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
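
The sketch below illustrates the core idea described above: each adapted layer keeps its frozen base weights and learns a gated, task-specific residual, with an L1-style penalty on the gates encouraging layers to fall back to the shared base weights. This is a minimal illustration, not the reference implementation; the class and function names (`TAPSLinear`, `sparsity_penalty`) and the straight-through gating are our assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TAPSLinear(nn.Module):
    """Sketch of a task-adaptive layer: frozen base weights plus a gated,
    task-specific residual (names and gating details are assumptions)."""

    def __init__(self, base_layer: nn.Linear):
        super().__init__()
        # Shared base weights, kept frozen.
        self.weight_base = nn.Parameter(base_layer.weight.detach().clone(),
                                        requires_grad=False)
        self.bias = (nn.Parameter(base_layer.bias.detach().clone(), requires_grad=False)
                     if base_layer.bias is not None else None)
        # Task-specific residual weights, initialized at zero.
        self.delta = nn.Parameter(torch.zeros_like(self.weight_base))
        # Scalar score deciding whether this layer becomes task-specific.
        self.score = nn.Parameter(torch.tensor(0.1))

    def gate(self) -> torch.Tensor:
        # Hard 0/1 gate with a straight-through gradient (an assumption here).
        hard = (self.score > 0).float()
        return hard + self.score - self.score.detach()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Active layer: base weights plus task-specific residual; inactive: base only.
        w = self.weight_base + self.gate() * self.delta
        return F.linear(x, w, self.bias)


def sparsity_penalty(model: nn.Module, lam: float = 0.25) -> torch.Tensor:
    # L1-style penalty on the number of active layers, added to the task loss.
    gates = [m.gate() for m in model.modules() if isinstance(m, TAPSLinear)]
    return lam * torch.stack(gates).sum()
```

In training, the total objective would be the usual task loss plus `sparsity_penalty(model)`; after convergence, only layers whose gate is 1 need to store a task-specific residual, while all other layers are shared with the base model.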
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Classification | Cars | Accuracy | 89.76 | 314 |
| Image Classification | CUB | Accuracy | 82.65 | 249 |
| Image Classification | Flowers | Accuracy | 96.68 | 127 |
| Image Classification | Visual Decathlon Challenge 1.0 (test) | Mean Accuracy | 78.7 | 81 |
| Image Classification | Sketch | -- | -- | 20 |
| Incremental Multi-Task Learning | DomainNet | Accuracy (Real) | 80.28 | 4 |
| Joint Multi-Task Learning | DomainNet | Real Accuracy | 78.91 | 3 |