
ICED: Concept-level Machine Unlearning via Interpretable Concept Decomposition

About

Machine unlearning in Vision-Language Models (VLMs) is typically performed at the image or instance level, making it difficult to precisely remove target knowledge without affecting unrelated semantics. This issue is especially pronounced since a single image often contains multiple entangled concepts, including both target concepts to be forgotten and contextual information that should be preserved. In this paper, we propose an interpretable concept-level unlearning framework for VLMs, which constructs a compact task-specific concept vocabulary from the forgetting set using a multimodal large language model. In addition to modality alignment, visual representations are decomposed into sparse, nonnegative combinations of semantic concepts, providing an explicit interface for fine-grained knowledge manipulation. Based on this decomposition, our method formulates unlearning as concept-level optimization, where target concepts are selectively suppressed while intra-instance non-target semantics and global cross-modal knowledge are preserved. Extensive experiments across both in-domain and out-of-domain forgetting settings demonstrate that our method enables more comprehensive target forgetting, better preserves non-target knowledge within the same image, and maintains competitive model utility compared with existing VLM unlearning methods.
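The decomposition step described above can be illustrated with a small sketch. The abstract does not say which solver, dictionary, or loss the authors use, so everything here is an assumption made for illustration: the concept dictionary `C` is random, the solver is a plain projected proximal-gradient loop for nonnegative sparse coding, and the helper names `decompose` and `suppress` are hypothetical. Zeroing a coefficient is only a stand-in for "suppressing a target concept"; the paper formulates unlearning as an optimization over the model, which this sketch does not reproduce.

```python
import numpy as np

def objective(w, v, C, lam):
    """0.5 * ||v - C.T @ w||^2 + lam * sum(w), evaluated for nonnegative w."""
    return 0.5 * np.sum((v - C.T @ w) ** 2) + lam * np.sum(w)

def decompose(v, C, lam=0.05, n_iter=500):
    """Nonnegative sparse coding of embedding v over concept rows of C.

    Hypothetical helper: minimizes the objective above subject to w >= 0
    using proximal gradient steps with step size 1/L.
    """
    L = np.linalg.norm(C, 2) ** 2      # Lipschitz constant of the smooth part
    w = np.zeros(C.shape[0])
    for _ in range(n_iter):
        grad = C @ (C.T @ w - v)       # gradient of the reconstruction term
        # The L1 penalty on nonnegative w adds a constant lam to the gradient;
        # clipping at zero enforces nonnegativity.
        w = np.maximum(0.0, w - (grad + lam) / L)
    return w

def suppress(w, target_idx):
    """Hypothetical edit: zero the coefficients of the target concepts only."""
    w_edit = w.copy()
    w_edit[target_idx] = 0.0
    return w_edit

# Toy data (assumed, not from the paper): 8 unit-norm concept vectors in 32-D.
rng = np.random.default_rng(0)
C = rng.normal(size=(8, 32))
C /= np.linalg.norm(C, axis=1, keepdims=True)

w_true = np.zeros(8)
w_true[[1, 5]] = [1.0, 0.5]            # image mixes concept 1 (target) and 5 (context)
v = C.T @ w_true                       # synthetic "visual embedding"

w = decompose(v, C)
w_edit = suppress(w, [1])              # forget concept 1, keep the rest
v_edit = C.T @ w_edit                  # edited embedding
```

The point of the explicit coefficient vector is that a target concept and the surrounding context occupy separate entries, so one can be changed without touching the other. In the toy example, `w_edit` differs from `w` only at index 1.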

Shen Lin, Jing Lin, Junhao Dong, Piotr Koniusz, Li Xu • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Machine Unlearning | ImageNet | Utility Preservation | 49.13 | 33 |
| Machine Unlearning | ImageNet Retain set | Zero-Shot Retention Accuracy | 53.67 | 18 |
| Machine Unlearning | Food | Accuracy | 78.58 | 18 |
| Machine Unlearning | ImageNet-1K All Classes | Zero-Shot Accuracy | 53.42 | 18 |
| Zero-shot Image Classification | Food | Zero-shot Accuracy | 78.57 | 18 |
| Zero-shot Image Classification | STL | Zero-shot Accuracy | 0.9592 | 18 |
| Zero-shot Image Classification | ObjectNet | Zero-shot Accuracy | 26.86 | 18 |
| Zero-shot Image Classification | CIFAR-10 | Zero-shot Accuracy | 70.02 | 18 |
| Machine Unlearning | STL | Accuracy | 89.46 | 18 |
| Machine Unlearning | ObjectNet | Accuracy | 24.52 | 18 |
Showing 10 of 14 rows
