ICED: Concept-level Machine Unlearning via Interpretable Concept Decomposition

About

Machine unlearning in Vision-Language Models (VLMs) is typically performed at the image or instance level, making it difficult to precisely remove target knowledge without affecting unrelated semantics. This issue is especially pronounced since a single image often contains multiple entangled concepts, including both target concepts to be forgotten and contextual information that should be preserved. In this paper, we propose an interpretable concept-level unlearning framework for VLMs, which constructs a compact task-specific concept vocabulary from the forgetting set using a multimodal large language model. In addition to modality alignment, visual representations are decomposed into sparse, nonnegative combinations of semantic concepts, providing an explicit interface for fine-grained knowledge manipulation. Based on this decomposition, our method formulates unlearning as concept-level optimization, where target concepts are selectively suppressed while intra-instance non-target semantics and global cross-modal knowledge are preserved. Extensive experiments across both in-domain and out-of-domain forgetting settings demonstrate that our method enables more comprehensive target forgetting, better preserves non-target knowledge within the same image, and maintains competitive model utility compared with existing VLM unlearning methods.
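The decomposition step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify a solver or an editing rule, so the code below assumes a plain nonnegative lasso solved by projected ISTA, and the names `decompose`, `suppress`, `l1`, and `steps` are hypothetical. It also edits a single embedding directly, whereas the paper formulates unlearning as an optimization over the model.

```python
import numpy as np

def decompose(v, C, l1=0.05, steps=500):
    """Sketch: find sparse w >= 0 with v ~= C.T @ w.

    v: (d,) image embedding; C: (k, d) concept embeddings, one per row.
    Minimizes 0.5 * ||C.T @ w - v||^2 + l1 * sum(w) subject to w >= 0
    (an assumed objective, not taken from the paper).
    """
    # Spectral norm of C @ C.T bounds the gradient's Lipschitz constant.
    L = np.linalg.norm(C @ C.T, 2)
    w = np.zeros(C.shape[0])
    for _ in range(steps):
        grad = C @ (C.T @ w - v)
        # Gradient step plus L1 shrinkage, then projection onto w >= 0.
        w = np.maximum(w - (grad + l1) / L, 0.0)
    return w

def suppress(v, C, w, target_idx):
    """Sketch: zero the target concepts' weights and keep everything else.

    The part of v not explained by the concepts (the residual) is added
    back, so non-target content is left untouched.
    """
    w_new = w.copy()
    w_new[target_idx] = 0.0
    residual = v - C.T @ w
    return C.T @ w_new + residual
```

For example, with concept rows `C` and an embedding `v`, calling `w = decompose(v, C)` followed by `suppress(v, C, w, [0])` removes the reconstructed contribution of concept 0 while leaving the weights of the other concepts unchanged.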

Shen Lin, Jing Lin, Junhao Dong, Piotr Koniusz, Li Xu • 2026

Related benchmarks

Task	Dataset	Metric	Result
Machine Unlearning	ImageNet	Utility Preservation	49.13	33
Machine Unlearning	ImageNet Retain set	Zero-Shot Retention Accuracy	53.67	18
Machine Unlearning	Food	Accuracy	78.58	18
Machine Unlearning	ImageNet-1K All Classes	Zero-Shot Accuracy	53.42	18
Zero-shot Image Classification	Food	Zero-shot Accuracy	78.57	18
Zero-shot Image Classification	STL	Zero-shot Accuracy	0.9592	18
Zero-shot Image Classification	ObjectNet	Zero-shot Accuracy	26.86	18
Zero-shot Image Classification	CIFAR-10	Zero-shot Accuracy	70.02	18
Machine Unlearning	STL	Accuracy	89.46	18
Machine Unlearning	ObjectNet	Accuracy	24.52	18

Showing 10 of 14 rows
