
Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning

About

The surge in Large Language Models (LLMs) has revolutionized natural language processing, but fine-tuning them for specific tasks often encounters challenges in balancing performance and preserving general instruction-following abilities. In this paper, we posit that the distribution gap between task datasets and the LLMs serves as the primary underlying cause. To address the problem, we introduce Self-Distillation Fine-Tuning (SDFT), a novel approach that bridges the distribution gap by guiding fine-tuning with a distilled dataset generated by the model itself to match its original distribution. Experimental results on the Llama-2-chat model across various benchmarks demonstrate that SDFT effectively mitigates catastrophic forgetting while achieving comparable or superior performance on downstream tasks compared to vanilla fine-tuning. Moreover, SDFT demonstrates the potential to maintain the helpfulness and safety alignment of LLMs. Our code is available at https://github.com/sail-sg/sdft.
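The core step described above, having the seed model rewrite each reference answer in its own words before fine-tuning, can be sketched as follows. This is a minimal illustration, not the authors' exact implementation: the prompt template, field names, and the `generate` callable (standing in for the seed LM) are all assumptions.

```python
# Hedged sketch of SDFT's self-distillation step. For each
# (instruction, reference answer) pair in the task dataset, the seed LM
# is prompted to produce a response consistent with the reference; the
# rewritten response replaces the original target, keeping the
# fine-tuning data close to the model's own distribution.

# Illustrative template (the paper's actual prompt may differ).
DISTILL_TEMPLATE = (
    "Below is an instruction and a reference answer. Write a response "
    "that answers the instruction, consistent with the reference answer.\n\n"
    "Instruction: {instruction}\n"
    "Reference answer: {reference}\n"
    "Response:"
)

def build_distilled_dataset(task_data, generate):
    """task_data: list of dicts with 'instruction' and 'reference' keys.
    generate: callable mapping a prompt string to the seed LM's output."""
    distilled = []
    for example in task_data:
        prompt = DISTILL_TEMPLATE.format(
            instruction=example["instruction"],
            reference=example["reference"],
        )
        response = generate(prompt)
        # Fall back to the original reference if generation is empty.
        # (The paper also filters degenerate outputs; this simplifies that.)
        target = response.strip() or example["reference"]
        distilled.append(
            {"instruction": example["instruction"], "response": target}
        )
    return distilled

# Tiny demo with a stub generator standing in for the seed LM:
demo = [{"instruction": "What is 2+2?", "reference": "4"}]
print(build_distilled_dataset(demo, lambda p: "The answer is 4."))
```

The resulting distilled dataset is then used in place of the raw task dataset for standard supervised fine-tuning.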

Zhaorui Yang, Tianyu Pang, Haozhe Feng, Han Wang, Wei Chen, Minfeng Zhu, Qian Liu • 2024

Related benchmarks

Task                              | Dataset       | Metric         | Result | Rank
Mathematical Reasoning            | GSM8K         | Accuracy       | 34.4   | 983
Code Generation                   | HumanEval     | Pass@1         | 18.3   | 850
Multi-task Language Understanding | MMLU          | Accuracy       | 57.67  | 842
Language Understanding            | MMLU          | Accuracy       | 84.13  | 756
Question Answering                | ARC Challenge | Accuracy       | 82.42  | 749
Mathematical Reasoning            | MATH          | Accuracy       | 7.34   | 643
Reasoning                         | BBH           | Accuracy       | 71.01  | 507
Question Answering                | ARC Easy      | Normalized Acc | 90.91  | 385
Mathematical Reasoning            | SVAMP         | Accuracy       | 18.2   | 368
Reading Comprehension             | RACE high     | Accuracy       | 74.59  | 295
(Showing 10 of 27 rows.)

Other info

Code: https://github.com/sail-sg/sdft