Less is More: Efficient Model Merging with Binary Task Switch

About

Model merging equips models with multi-task capabilities without additional training and has garnered significant attention. However, existing methods suffer from conflicts among redundant parameters and from the excessive storage burden of keeping task parameters. In this work, through controlled experiments, we reveal that, for task vectors, only parameters with magnitudes above a certain threshold contribute positively to the task, exhibiting a pulse-like characteristic. We then leverage this characteristic to binarize the task vectors and reduce storage overhead. Further controlled experiments show that the binarized task vectors incur almost no decrease in fine-tuning and merging performance, and even exhibit stronger performance improvements as the proportion of redundant parameters increases. Based on these insights, we propose Task Switch (T-Switch), which decomposes task vectors into three components: 1) an activation switch instantiated by a binarized mask vector, 2) a polarity switch instantiated by a binarized sign vector, and 3) a scaling knob instantiated by a scalar coefficient. By storing task vectors in a binarized form, T-Switch alleviates parameter conflicts while ensuring efficient task parameter storage. To automate switch combination in T-Switch, we further introduce Auto-Switch, which combines switches without training via retrieval from a small query set. Experiments indicate that our methods achieve significant performance improvements over existing baselines while requiring only 1-3% of the storage space of full-precision parameters.
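
The three-part decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `TaskSwitch`, `binarize_task_vector`, `reconstruct`, and `merge` are hypothetical, the threshold is passed in by hand, the scalar is assumed to be the mean magnitude of the kept entries (the least-squares fit of one scalar to those entries), and merging is assumed to be a plain sum of reconstructed task vectors. Flat Python lists stand in for model weights.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskSwitch:
    """Hypothetical container for one binarized task vector."""
    mask: List[bool]   # activation switch: True where the entry is kept
    sign: List[bool]   # polarity switch: True for non-negative entries
    scale: float       # scaling knob: a single scalar coefficient


def binarize_task_vector(pretrained: List[float],
                         finetuned: List[float],
                         threshold: float) -> TaskSwitch:
    """Decompose the task vector (finetuned - pretrained) into mask, sign, scale.

    Assumption: `scale` is the mean magnitude of the kept entries; the
    abstract does not say how the scalar coefficient is chosen.
    """
    tau = [f - p for f, p in zip(finetuned, pretrained)]
    mask = [abs(t) >= threshold for t in tau]   # keep large-magnitude entries only
    sign = [t >= 0 for t in tau]
    kept = [abs(t) for t, m in zip(tau, mask) if m]
    scale = sum(kept) / len(kept) if kept else 0.0
    return TaskSwitch(mask, sign, scale)


def reconstruct(switch: TaskSwitch) -> List[float]:
    """Rebuild the approximate task vector as scale * mask * sign."""
    return [
        (switch.scale if s else -switch.scale) if m else 0.0
        for m, s in zip(switch.mask, switch.sign)
    ]


def merge(pretrained: List[float], switches: List[TaskSwitch]) -> List[float]:
    """Assumed merge rule: add every reconstructed task vector to the base weights."""
    merged = list(pretrained)
    for sw in switches:
        for i, delta in enumerate(reconstruct(sw)):
            merged[i] += delta
    return merged
```

For example, `binarize_task_vector([0.0, 1.0, -1.0, 0.5], [0.5, 1.01, -1.5, 0.5], threshold=0.1)` drops the small 0.01 update and the zero update, keeping only the two entries whose magnitude is 0.5. Only the two bit vectors and one float need to be stored per task.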

Biqing Qi, Fangyuan Li, Zhen Wang, Junqi Gao, Dong Li, Peng Ye, Bowen Zhou • 2024

Related benchmarks

Task	Dataset	Metric	Result	
Multi-task Language Understanding	MMLU	Accuracy	68.05	353
Bias Evaluation	BBQ	Accuracy	80.69	171
Image Classification	Vision Multi-task Suite (SUN397, Cars, RESISC45, EuroSAT, SVHN, GTSRB, MNIST, DTD)	Average Accuracy	94.16	104
Visual Classification	8 Vision Tasks (SUN397, Stanford Cars, RESISC45, EuroSAT, SVHN, GTSRB, MNIST, DTD)	Average Accuracy	90.15	86
Truthfulness	TruthfulQA	Truthfulness Accuracy	53.72	86
Image Classification	SUN397, Cars, EuroSAT, GTSRB, MNIST, DTD Seen Tasks (test)	SUN397 Accuracy	0.8184	34
Image Classification	RESISC45, SVHN Unseen Tasks (test)	RESISC45 Accuracy	71.33	34
Question Answering	MMLU, TruthfulQA, and BBQ	MMLU Accuracy	68.05	21
Natural Language Understanding	GLUE	CoLA	76.27	16
Natural Language Understanding	GLUE RoBERTa-base (val)	CoLA Score	53.12	16

