
Improving Fisher Information Estimation and Efficiency for LoRA-based LLM Unlearning

About

LLMs have demonstrated remarkable performance across various tasks but face challenges related to unintentionally generating outputs containing sensitive information. A straightforward way to address this issue is to retrain the model after excluding the problematic data, but this incurs prohibitively high computational costs. To overcome this limitation, machine unlearning has emerged as a promising solution that can effectively remove sensitive information without retraining the model from scratch. Recently, FILA was proposed as a parameter-efficient unlearning method that integrates LoRA adapters. Specifically, it computes the Fisher information to identify parameters associated with the forget set and assigns them to LoRA adapters for updates. Despite its innovative approach, FILA still requires access to all model parameters and does not adequately account for fundamental assumptions underlying Fisher information, leading to inaccuracies in importance estimation. To address these limitations, we propose VILA, a novel unlearning framework that explicitly considers the assumptions overlooked in FILA, thereby enhancing the accuracy of parameter identification for the forget set. Moreover, VILA significantly reduces computational costs by enabling parameter identification without accessing the entire model. Our method achieves up to 100x higher parameter efficiency and 40x faster training speed compared to FILA, and sets new state-of-the-art performance on benchmarks including TOFU, WMDP, and MUSE. Our code is available at https://github.com/kyj93790/VILA.
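The core idea the abstract attributes to FILA, scoring each parameter's association with the forget set via Fisher information and routing the top-scoring ones to LoRA adapters, can be illustrated with a toy sketch. This is not the paper's implementation: it uses a hypothetical logistic-regression "model" and the common diagonal empirical-Fisher approximation (mean of squared per-sample log-likelihood gradients), with a made-up top-k selection step standing in for adapter assignment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in "model": logistic regression with weight vector w over d features.
d, n = 8, 200
w = rng.normal(size=d)
X = rng.normal(size=(n, d))               # stand-in for forget-set inputs
y = (rng.random(n) < 0.5).astype(float)   # stand-in for forget-set labels

# Per-sample gradient of the log-likelihood w.r.t. w for logistic regression
# is (y_i - sigmoid(x_i . w)) * x_i.
p = 1.0 / (1.0 + np.exp(-X @ w))
per_sample_grads = (y - p)[:, None] * X   # shape (n, d)

# Diagonal empirical Fisher estimate: E[g_j^2] per parameter j.
fisher_diag = np.mean(per_sample_grads ** 2, axis=0)

# Hypothetical selection step: the k parameters with the highest Fisher
# scores would be the ones assigned to LoRA adapters for unlearning updates.
k = 3
top_k = np.argsort(fisher_diag)[::-1][:k]
```

Because the empirical Fisher is an average of squared gradients, every score is non-negative, and parameters whose gradients on the forget set are consistently large receive the highest scores.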

Yejin Kim, Eunwon Kim, Buru Chang, Junsuk Choe• 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Machine Unlearning | TOFU Forget01 (1% authors) | Forget Quality (ROUGE-L) | 0.03 | 48 |
| Machine Unlearning | TOFU Forget10 (10% authors split) | Forget Quality (ROUGE-L) | 0.02 | 42 |
| Machine Unlearning | TOFU Forget05 (5% authors) | Forget Quality (ROUGE-L) | 0.02 | 42 |
| Machine Unlearning | TOFU 1.0 (forget01) | MU Score | 0.00e+0 | 33 |
| Machine Unlearning | TOFU Forget10, Phi-1.5B model | Forget Quality (FQ) | 5.50e-17 | 24 |
| Machine Unlearning | TOFU Forget01, Phi-1.5B model (1%) | Forget Quality (ROUGE-L) | 4 | 24 |
| Machine Unlearning | TOFU Forget05, Phi-1.5B model (5%) | Forget Quality (ROUGE-L) | 0.09 | 20 |
| Language Model Unlearning | TOFU Forget10 | Forget Quality (FQ) | 5.10e-15 | 15 |
| Machine Unlearning | TOFU 1.0 (Forget05) | Forget ROUGE-L | 0.00e+0 | 10 |
| Machine Unlearning | TOFU 1.0 (Forget10) | Forget ROUGE-L | 0.00e+0 | 10 |
