Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities

About

Model merging is an efficient empowerment technique in the machine learning community that requires neither the collection of raw training data nor expensive computation. As model merging becomes increasingly prevalent across various fields, it is crucial to understand the available model merging techniques comprehensively. However, there is a significant gap in the literature regarding a systematic and thorough review of these techniques. This survey provides a comprehensive overview of model merging methods and theories, their applications in various domains and settings, and future research directions. Specifically, we first propose a new taxonomic approach that exhaustively discusses existing model merging methods. Second, we discuss the application of model merging techniques in large language models, multimodal large language models, and more than ten machine learning subfields, including continual learning, multi-task learning, and few-shot learning. Finally, we highlight the remaining challenges of model merging and discuss future research directions. A comprehensive list of papers about model merging is available at https://github.com/EnnengYang/Awesome-Model-Merging-Methods-Theories-Applications.

Enneng Yang, Li Shen, Guibing Guo, Xingwei Wang, Xiaochun Cao, Jie Zhang, Dacheng Tao • 2024

Related benchmarks

Task	Dataset	Result
Code Generation	HumanEval	--	1043
Instruction Following	IFEval	IFEval Accuracy 42.7	836
Instruction Following	AlpacaEval	Win Rate 11.51	420
Code Generation	MBPP	Accuracy (%) 52.16	146
Instruction Following	IFEval (test)	IFEval Score 39.37	88
Helpfulness	Alpaca Eval	Alpaca Eval (%) 12.28	22
Code Generation	MBPP	MBPP Accuracy 48.56	22
Harmlessness	Toxigen	Toxigen (%) 99.96	17
Model Merging	Vision, Language, and Multi-modal tasks	Parameters 1	11
LLM Alignment	Combined Suite Setup 3	Average Percentage Score 51.5	9
