Neural Parameter Search for Slimmer Fine-Tuned Models and Better Transfer
About
Foundation models and their checkpoints have significantly advanced deep learning, boosting performance across various applications. However, fine-tuned models often struggle outside their specific domains and exhibit considerable redundancy. Recent studies suggest that combining a pruned fine-tuned model with the original pre-trained model can mitigate forgetting, reduce interference when merging model parameters across tasks, and improve compression efficiency. In this context, developing an effective pruning strategy for fine-tuned models is crucial. Leveraging the advantages of the task vector mechanism, we preprocess fine-tuned models by calculating the differences between them and the original model. Recognizing that different task vector subspaces contribute variably to model performance, we introduce a novel method called Neural Parameter Search (NPS-Pruning) for slimming down fine-tuned models. This method enhances pruning efficiency by searching through neural parameters of task vectors within low-rank subspaces. Our method has three key applications: enhancing knowledge transfer through pairwise model interpolation, facilitating effective knowledge fusion via model merging, and enabling the deployment of compressed models that retain near-original performance while significantly reducing storage costs. Extensive experiments across vision, NLP, and multi-modal benchmarks demonstrate the effectiveness and robustness of our approach, resulting in substantial performance gains. The code is publicly available at: https://github.com/duguodong7/NPS-Pruning.
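The preprocessing step described above — subtracting the pre-trained checkpoint from a fine-tuned checkpoint to obtain a task vector, sparsifying it, and re-attaching it to the base model — can be sketched in a few lines. This is a minimal illustration using plain magnitude pruning on NumPy arrays, not the paper's actual NPS-Pruning search over low-rank subspaces; all function names here are hypothetical.

```python
import numpy as np

def task_vector(pretrained, finetuned):
    """Parameter-wise difference between a fine-tuned checkpoint and its base."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}

def prune_task_vector(tv, keep_ratio=0.2):
    """Zero out all but the largest-magnitude entries of each tensor.

    Simple magnitude pruning, used here as a stand-in for the paper's
    neural parameter search within low-rank subspaces.
    """
    pruned = {}
    for name, v in tv.items():
        flat = np.abs(v).ravel()
        k_keep = max(1, int(keep_ratio * flat.size))
        thresh = np.partition(flat, -k_keep)[-k_keep]  # k-th largest magnitude
        pruned[name] = np.where(np.abs(v) >= thresh, v, 0.0)
    return pruned

def apply_task_vector(pretrained, tv, scale=1.0):
    """Re-attach a (pruned) task vector to the pre-trained base model."""
    return {k: pretrained[k] + scale * tv[k] for k in pretrained}

# Toy demonstration with a single 4x4 weight matrix.
rng = np.random.default_rng(0)
base = {"w": rng.normal(size=(4, 4))}
ft = {"w": base["w"] + rng.normal(scale=0.1, size=(4, 4))}

tv = task_vector(base, ft)
slim = apply_task_vector(base, prune_task_vector(tv, keep_ratio=0.25))
```

In this toy setup, keeping 25% of the task-vector entries still moves the base model toward the fine-tuned one while the stored delta becomes mostly zeros, which is what enables the storage savings the abstract mentions.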
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | 8 Vision Tasks (test) | Avg Accuracy | 87.6 | 47 |
| Image Classification | 8 Vision Tasks (ViT-B/32 + ViT-L/14 average) | Accuracy | 94.3 | 31 |
| Emotion Classification | 5 Emotion Domains (test) | Average Accuracy | 40.9 | 20 |
| Natural Language Processing | 7 NLP Tasks (test) | Average Accuracy | 82.1 | 20 |
| Natural Language Processing | 7 NLP Tasks (T5-Base + T5-Large average) | Accuracy (%) | 88.8 | 12 |
| Parameter-Efficient Fine-Tuning | 11 PEFT Tasks (test) | Avg Accuracy | 68.2 | 10 |
| Large Language Model Reasoning | 3 LLM Tasks (CMMLU, GSM8K, HumanEval) (test) | Average Accuracy | 35.3 | 7 |
| Large Language Modeling | 3 LLM Tasks (LLaMa2 average) | Accuracy | 0.405 | 6 |
| Knowledge Transfer | LLaVA-1.5 Evaluation Suite (OKVQA) | VQAv2 Accuracy | 76.81 | 6 |
| Knowledge Transfer | LLaVA-1.5 Evaluation Suite (Flickr30k) | VQAv2 Accuracy | 74.3 | 6 |