ScaLearn: Simple and Highly Parameter-Efficient Task Transfer by Learning to Scale

About

Multi-task learning (MTL) has shown considerable practical benefits, particularly when using language models (LMs). While this is commonly achieved by learning $n$ tasks under a joint optimization procedure, some methods, such as AdapterFusion, divide the problem into two stages: (i) task learning, where knowledge specific to a task is encapsulated within sets of parameters (e.g., adapters), and (ii) transfer, where this already learned knowledge is leveraged for a target task. This separation of concerns provides numerous benefits (e.g., promoting reusability). However, current two-stage MTL introduces a substantial number of additional parameters. We address this issue by leveraging the usefulness of linearly scaling the output representations of source adapters for transfer learning. We introduce ScaLearn, a simple and highly parameter-efficient two-stage MTL method that capitalizes on the knowledge of the source tasks by learning a minimal set of scaling parameters that enable effective transfer to a target task. Our experiments on three benchmarks (GLUE, SuperGLUE, and HumSet) and two encoder LMs show that ScaLearn consistently outperforms strong baselines with a small number of transfer parameters (~ $0.35$% of those of AdapterFusion). Remarkably, we observe that ScaLearn maintains its strong abilities even when further reducing parameters, achieving competitive results with only $8$ transfer parameters per target task. Our proposed approach thus demonstrates the power of simple scaling as a promise for more efficient task transfer.
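The transfer step described above, linearly scaling the output representations of source adapters, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `scale_and_combine`, the toy vectors, and the choice of one scalar per source task are assumptions made for illustration. In the actual method the scaling values would be learned on the target task; here they are fixed by hand.

```python
from typing import List, Sequence


def scale_and_combine(
    source_outputs: Sequence[Sequence[float]],
    scales: Sequence[float],
) -> List[float]:
    """Return the weighted sum of source adapter outputs.

    Hypothetical helper: each source adapter contributes its output
    vector multiplied by a single scalar (an assumed configuration).
    """
    if len(source_outputs) != len(scales):
        raise ValueError("need exactly one scale per source adapter")
    dim = len(source_outputs[0])
    combined = [0.0] * dim
    for output, scale in zip(source_outputs, scales):
        if len(output) != dim:
            raise ValueError("all adapter outputs must share one dimension")
        for j, value in enumerate(output):
            combined[j] += scale * value
    return combined


# Toy outputs of three (frozen) source adapters for one token position.
source_outputs = [[1.0, 2.0], [0.5, -1.0], [4.0, 0.0]]
# Hand-picked scaling values standing in for learned parameters.
scales = [0.5, 2.0, 0.25]

combined = scale_and_combine(source_outputs, scales)
# Under this assumption, the transfer parameters are just the scales.
num_transfer_params = len(scales)
```

With one scalar per source task, the number of transfer parameters equals the number of source tasks, which gives a sense of how a target task could be served by only a handful of parameters.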

Markus Frohmann, Carolin Holtermann, Shahed Masoudian, Anne Lauscher, Navid Rekabsaz • 2023

Related benchmarks

Task	Dataset	Result
Natural Language Understanding	SuperGLUE	SGLUE Score 75.74	84
General Language Understanding	GLUE v1 (test dev)	MNLI 87.06	40
Natural Language Understanding	GLUE and SuperGLUE (test val)	SST-2 95.7	37
Natural Language Understanding	GLUE RoBERTa LARGE (test dev)	MNLI Accuracy 90.31	22
Classification	HumSet XLM-RBASE (test)	Sectors Score 72.38	17
Multilingual Multi-label Text Classification	HumSet (test)	Sectors 73.32	17
Natural Language Understanding	SuperGLUE RoBERTa-large (test)	ReCoRD 88.85	17
