
CoreUnlearn: Rethinking Concept Unlearning through Disentangled Component-Level Erasure in Text-guided Diffusion Models

About

Text-guided diffusion models have revolutionized image synthesis but also raise ethical concerns, such as privacy violations and harmful content generation. To mitigate these issues, prevailing methods typically use an alignment mechanism with predefined erasure references to fine-tune pre-trained model weights. However, these techniques are intrinsically limited by the representational capacity of the textual space and are highly sensitive to the choice of erasure references; for example, a suboptimal reference may significantly impair the preservation of model utility during erasure. To overcome these limitations, we introduce CoreUnlearn, which aims to disentangle and remove the erasure-critical component of an undesirable concept. Specifically, CoreUnlearn comprises a Component Extraction Module (CEM) and a Swap Disentangling Strategy (SDS). Guided by SDS, CEM is pre-trained to decompose concept embeddings into distinct component types. Leveraging this decomposition, CoreUnlearn then fine-tunes the model weights to remove the erasure-critical component while retaining the non-critical ones. Extensive experiments demonstrate that CoreUnlearn achieves effective concept erasure with minimal impact on overall model performance.

Mengnan Zhao, Lihe Zhang, Baocai Yin • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Explicit Content Removal | I2P | Buttocks Count | 2 | 47 |
| Object Unlearning | Imagenette Object Unlearning Subset | ERASE FID | 259.5 | 16 |
| Style Unlearning | Artistic Styles SD v1.4 (test) | ERASE FID | 294.2 | 16 |
| Concept Unlearning | I2P Stable Diffusion v1.4 | Erase ACC | 12.65 | 7 |
| Object Unlearning | SD 2 | ERASE FID | 244.6 | 5 |
| Style Unlearning | SD 2 | ERASE FID | 212 | 5 |