
Improving Fisher Information Estimation and Efficiency for LoRA-based LLM Unlearning

About

LLMs have demonstrated remarkable performance across various tasks but face challenges related to unintentionally generating outputs containing sensitive information. A straightforward way to address this issue is to retrain the model after excluding the problematic data, but this incurs prohibitively high computational costs. To overcome this limitation, machine unlearning has emerged as a promising solution that can effectively remove sensitive information without retraining the model from scratch. Recently, FILA was proposed as a parameter-efficient unlearning method that integrates LoRA adapters. Specifically, it computes the Fisher information to identify parameters associated with the forget set and assigns them to LoRA adapters for updates. Despite its innovative approach, FILA still requires access to all model parameters and does not adequately account for fundamental assumptions underlying Fisher information, leading to inaccuracies in importance estimation. To address these limitations, we propose VILA, a novel unlearning framework that explicitly considers the assumptions overlooked in FILA, thereby enhancing the accuracy of parameter identification for the forget set. Moreover, VILA significantly reduces computational costs by enabling parameter identification without accessing the entire model. Our method achieves up to 100x higher parameter efficiency and 40x faster training speed compared to FILA, and sets new state-of-the-art performance on benchmarks including TOFU, WMDP, and MUSE. Our code is available at https://github.com/kyj93790/VILA.
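The core idea the abstract attributes to FILA, scoring each parameter's association with the forget set via Fisher information and routing the top-scoring ones to LoRA adapters, can be illustrated with a toy sketch. This is not the paper's implementation: it uses a hypothetical logistic-regression "model" and the common diagonal empirical-Fisher approximation (mean of squared per-sample log-likelihood gradients), with a made-up top-k selection step standing in for adapter assignment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in "model": logistic regression with weight vector w over d features.
d, n = 8, 200
w = rng.normal(size=d)
X = rng.normal(size=(n, d))               # stand-in for forget-set inputs
y = (rng.random(n) < 0.5).astype(float)   # stand-in for forget-set labels

# Per-sample gradient of the log-likelihood w.r.t. w for logistic regression
# is (y_i - sigmoid(x_i . w)) * x_i.
p = 1.0 / (1.0 + np.exp(-X @ w))
per_sample_grads = (y - p)[:, None] * X   # shape (n, d)

# Diagonal empirical Fisher estimate: E[g_j^2] per parameter j.
fisher_diag = np.mean(per_sample_grads ** 2, axis=0)

# Hypothetical selection step: the k parameters with the highest Fisher
# scores would be the ones assigned to LoRA adapters for unlearning updates.
k = 3
top_k = np.argsort(fisher_diag)[::-1][:k]
```

Because the empirical Fisher is an average of squared gradients, every score is non-negative, and parameters whose gradients on the forget set are consistently large receive the highest scores.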

Yejin Kim, Eunwon Kim, Buru Chang, Junsuk Choe• 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Machine Unlearning | TOFU Forget01 (1% authors) | Forget Quality (ROUGE-L) | 0.03 | 48 |
| Machine Unlearning | TOFU Forget10 (10% authors split) | Forget Quality (ROUGE-L) | 0.02 | 42 |
| Machine Unlearning | TOFU Forget05 (5% authors) | Forget Quality (ROUGE-L) | 0.02 | 42 |
| Machine Unlearning | TOFU 1.0 (forget01) | MU Score | 0.00e+0 | 33 |
| Machine Unlearning | TOFU Forget10, Phi-1.5B model | Forget Quality (FQ) | 5.50e-17 | 24 |
| Machine Unlearning | TOFU Forget01, Phi-1.5B model (1%) | Forget Quality (ROUGE-L) | 4 | 24 |
| Machine Unlearning | TOFU Forget05, Phi-1.5B model (5%) | Forget Quality (ROUGE-L) | 0.09 | 20 |
| Language Model Unlearning | TOFU Forget10 | Forget Quality (FQ) | 5.10e-15 | 15 |
| Machine Unlearning | TOFU 1.0 (Forget05) | Forget ROUGE-L | 0.00e+0 | 10 |
| Machine Unlearning | TOFU 1.0 (Forget10) | Forget ROUGE-L | 0.00e+0 | 10 |
