Unveiling Entity-Level Unlearning for Large Language Models: A Comprehensive Analysis

About

Large language model unlearning has garnered increasing attention for its potential to address security and privacy concerns, leading to extensive research in the field. However, much of this research has concentrated on instance-level unlearning, which targets the removal of predefined instances containing sensitive content. This focus leaves a significant gap in the exploration of full entity-level unlearning, which is critical in real-world scenarios such as copyright protection. To this end, we propose the novel task of entity-level unlearning, which aims to completely erase entity-related knowledge from the target model. To investigate this task thoroughly, we systematically evaluate popular unlearning algorithms, revealing that current methods struggle to achieve effective entity-level unlearning. We then explore the factors that influence unlearning performance, identifying that knowledge coverage and the size of the forget set play pivotal roles. Notably, our analysis also uncovers that entities introduced through fine-tuning are more vulnerable to unlearning than pre-trained entities. These findings collectively offer valuable insights for advancing entity-level unlearning for LLMs.

Weitao Ma, Xiaocheng Feng, Weihong Zhong, Lei Huang, Yangfan Ye, Xiachong Feng, Bing Qin• 2024
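To make the task concrete, a common baseline evaluated in unlearning work is a gradient-difference style update: ascend on the loss over the forget set (entity-related examples) while descending on the loss over a retain set to preserve other capabilities. The sketch below illustrates this idea on a toy one-parameter logistic model; it is a minimal, hypothetical illustration, not the paper's actual method, and all function names (`unlearn`, `loss`, `grad`) are placeholders.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, x, y):
    # Binary cross-entropy for a 1-D logistic model p(y=1|x) = sigmoid(w * x).
    p = min(max(sigmoid(w * x), 1e-9), 1 - 1e-9)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def grad(w, x, y):
    # d/dw of the cross-entropy above.
    return (sigmoid(w * x) - y) * x

def unlearn(w, forget, retain, lr=0.1, steps=100):
    # Gradient-difference update: gradient ASCENT on the forget-set loss
    # (to erase the targeted knowledge) combined with gradient DESCENT on
    # the retain-set loss (to limit collateral damage to other knowledge).
    for _ in range(steps):
        g_forget = sum(grad(w, x, y) for x, y in forget) / len(forget)
        g_retain = sum(grad(w, x, y) for x, y in retain) / len(retain)
        w = w + lr * g_forget - lr * g_retain
    return w
```

After unlearning, the loss on the forget set should rise (the model "forgets" those examples) while the retain-set loss stays comparatively controlled; scaled up to an LLM, the same tension between forgetting and retention is what the paper's forget-set-size and knowledge-coverage analyses probe.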

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| General Language Model Evaluation | Utility Set (MMLU, BBH, TruthfulQA, TriviaQA, AlpacaEval) | MMLU | 68.57 | 34 |
| Knowledge Unlearning | RWKU (Forget Set) | FB | 65.39 | 23 |
| Knowledge Retention | RWKU (Neighbor Set) | FB Score | 64.64 | 17 |
| Unlearning | TOFU Neighbor Set | FB Score | 66.85 | 17 |
| Membership Inference Attack | TOFU MIA Set | FM | 2.0591 | 17 |
| Unlearning | TOFU Forget Set | FB | 66.75 | 17 |
| Membership Inference Attack | RWKU MIA Set | FM Score | 1.9559 | 17 |
| Machine Unlearning | TOFU finetuned Llama-2-7b-chat (forget set) | Probability | 98.43 | 14 |
| Machine Unlearning | TOFU | Probability | 77.31 | 13 |