
Neural Parameter Search for Slimmer Fine-Tuned Models and Better Transfer

About

Foundation models and their checkpoints have significantly advanced deep learning, boosting performance across various applications. However, fine-tuned models often struggle outside their specific domains and exhibit considerable redundancy. Recent studies suggest that combining a pruned fine-tuned model with the original pre-trained model can mitigate forgetting, reduce interference when merging model parameters across tasks, and improve compression efficiency. In this context, developing an effective pruning strategy for fine-tuned models is crucial. Leveraging the advantages of the task vector mechanism, we preprocess fine-tuned models by calculating the differences between them and the original model. Recognizing that different task vector subspaces contribute variably to model performance, we introduce a novel method called Neural Parameter Search (NPS-Pruning) for slimming down fine-tuned models. This method enhances pruning efficiency by searching through neural parameters of task vectors within low-rank subspaces. Our method has three key applications: enhancing knowledge transfer through pairwise model interpolation, facilitating effective knowledge fusion via model merging, and enabling the deployment of compressed models that retain near-original performance while significantly reducing storage costs. Extensive experiments across vision, NLP, and multi-modal benchmarks demonstrate the effectiveness and robustness of our approach, resulting in substantial performance gains. The code is publicly available at: https://github.com/duguodong7/NPS-Pruning.
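The core preprocessing step described above — subtracting the pre-trained weights from the fine-tuned weights to obtain a task vector, sparsifying it, and adding it back to the base model — can be sketched as follows. This is a minimal illustration using plain per-tensor magnitude pruning as a stand-in for the paper's neural parameter search over low-rank subspaces; the function names (`task_vector`, `prune_task_vector`, `merge_task_vectors`, `apply_task_vector`) and the `keep_ratio`/`scale` parameters are hypothetical, not the repository's API.

```python
import torch

def task_vector(pretrained_sd, finetuned_sd):
    # Task vector = fine-tuned weights minus pre-trained weights,
    # computed per parameter tensor in the state dict.
    return {k: finetuned_sd[k] - pretrained_sd[k] for k in pretrained_sd}

def prune_task_vector(tv, keep_ratio=0.1):
    # Simple stand-in for NPS: keep only the largest-magnitude entries
    # of each tensor (ties at the threshold may keep a few extra entries).
    pruned = {}
    for k, v in tv.items():
        flat = v.abs().flatten()
        n_keep = max(1, int(keep_ratio * flat.numel()))
        thresh = torch.topk(flat, n_keep).values.min()
        pruned[k] = v * (v.abs() >= thresh)
    return pruned

def merge_task_vectors(tvs, weights=None):
    # Knowledge fusion: weighted sum of (pruned) task vectors from several tasks.
    weights = weights or [1.0 / len(tvs)] * len(tvs)
    keys = tvs[0].keys()
    return {k: sum(w * tv[k] for w, tv in zip(weights, tvs)) for k in keys}

def apply_task_vector(pretrained_sd, tv, scale=1.0):
    # Re-attach the (pruned or merged) task vector to the base model.
    return {k: pretrained_sd[k] + scale * tv[k] for k in pretrained_sd}
```

Because the pruned task vector is sparse, only its nonzero entries need to be stored alongside the shared pre-trained checkpoint, which is where the compression benefit comes from.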

Guodong Du, Zitao Fang, Jing Li, Junlin Li, Runhua Jiang, Shuyang Yu, Yifei Guo, Yangneng Chen, Sim Kuan Goh, Ho-Kin Tang, Daojing He, Honghai Liu, Min Zhang• 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | 8 Vision Tasks (test) | Avg Accuracy | 87.6 | 82 |
| Natural Language Processing | 7 NLP Tasks (test) | Average Accuracy | 82.1 | 38 |
| Image Classification | 8 Vision Tasks Aggregate ViT-B/32 ViT-L/14 (average) | Accuracy | 94.3 | 31 |
| Emotion Classification | 5 Emotion Domains (test) | Average Accuracy | 40.9 | 20 |
| Natural Language Processing | 7 NLP Tasks Aggregate T5-Base T5-Large (average) | Accuracy (%) | 88.8 | 12 |
| Parameter-Efficient Fine-Tuning | 11 PEFT Tasks (test) | Avg Accuracy | 68.2 | 10 |
| Large Language Model Reasoning | 3 LLM Tasks (CMMLU, GSM8K, HumanEval) (test) | Average Accuracy | 35.3 | 7 |
| Large Language Modeling | 3 LLM Tasks Aggregate LLaMa2 (average) | Accuracy | 0.405 | 6 |
| Knowledge Transfer | LLaVA-1.5 Evaluation Suite OKVQA | VQAv2 Accuracy | 76.81 | 6 |
| Knowledge Transfer | LLaVA-1.5 Evaluation Suite Flickr30k | VQAv2 Accuracy | 74.3 | 6 |
