
Erasing Thousands of Concepts: Towards Scalable and Practical Concept Erasure for Text-to-Image Diffusion Models

About

Large-scale text-to-image (T2I) diffusion models deliver remarkable visual fidelity but pose safety risks due to their capacity to reproduce undesirable content, such as copyrighted material. Concept erasure has emerged as a mitigation strategy, yet existing approaches struggle to balance scalability, precision, and robustness, which restricts their applicability to erasing only a few hundred concepts. To address these limitations, we present Erasing Thousands of Concepts (ETC), a scalable framework capable of erasing thousands of concepts while preserving generation quality. Our method first models low-rank concept distributions via a Student's t-distribution Mixture Model (tMM). This model enables pinpoint erasure of target concepts via affine optimal transport while preserving other concepts by anchoring the boundaries of the target concept distributions, without pre-defined anchor concepts. We then train a Mixture-of-Experts (MoE)-based module, termed MoEraser, which removes target embeddings while preserving the anchor embeddings. By injecting noise into the text embedding projector and fine-tuning MoEraser for recovery, our framework achieves robustness to white-box attacks such as module removal. Extensive experiments on over 2,000 concepts across heterogeneous domains and diffusion models demonstrate state-of-the-art scalability and precision in large-scale concept erasure.
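To make the "affine optimal transport" step more concrete, here is a minimal sketch of an affine transport map in the simplest setting. It is not the authors' implementation: it assumes Gaussian source and target distributions with diagonal covariance (the abstract describes Student's t mixtures in a low-rank space), and the function name `affine_ot_map_diag` and all numbers are hypothetical. Under these assumptions the optimal transport map for the squared-distance cost is the per-dimension affine map T(x) = mean_tgt + (std_tgt / std_src) * (x - mean_src).

```python
import random
import statistics


def affine_ot_map_diag(mean_src, std_src, mean_tgt, std_tgt):
    """Build T(x) = mean_tgt + (std_tgt / std_src) * (x - mean_src), per dimension.

    Assumption (not from the source): both distributions are Gaussians
    with diagonal covariance, so the map reduces to a per-axis rescale.
    """
    scales = [t / s for s, t in zip(std_src, std_tgt)]

    def transport(x):
        return [
            mt + a * (xi - ms)
            for xi, ms, mt, a in zip(x, mean_src, mean_tgt, scales)
        ]

    return transport


# Hypothetical 2-D "target concept" (source) and "replacement" (target) statistics.
mean_src, std_src = [2.0, -1.0], [0.5, 2.0]
mean_tgt, std_tgt = [0.0, 3.0], [1.0, 1.0]
transport = affine_ot_map_diag(mean_src, std_src, mean_tgt, std_tgt)

# Push samples drawn from the source distribution through the map.
random.seed(0)
samples = [
    [random.gauss(m, s) for m, s in zip(mean_src, std_src)]
    for _ in range(5000)
]
moved = [transport(x) for x in samples]

# Empirical statistics of the transported samples, one value per dimension.
moved_mean = [statistics.fmean(col) for col in zip(*moved)]
moved_std = [statistics.pstdev(col) for col in zip(*moved)]
```

In this toy setting, `moved_mean` and `moved_std` should land close to `mean_tgt` and `std_tgt`. That is the sense in which an affine map moves one distribution onto another while leaving points outside the source distribution to be handled separately.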

Hoigi Seo, Byung Hyun Lee, Jaehyun Cho, Sungjin Lim, Se Young Chun • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Text-to-Image Generation | COCO 30k | FID | 13.61 | 63 |
| Explicit Content Removal | I2P | Buttocks Count | 0.00 | 47 |
| Content Preservation | MS-COCO (30K) | FID | 14.06 | 19 |
| Concept Preservation | 100 Artistic Styles | CLIP Score | 29.14 | 10 |
| Concept Erasure and Preservation | 50 Target Celebrities and 100 Remaining Celebrities | Accuracy (Target) | 0.24 | 10 |
| Concept Erasure Robustness | Ring-A-Bell (RAB) | Attack Success Rate | 1.42 | 7 |
| Concept Erasure Robustness | UnlearnDiff (UD) | Attack Success Rate | 52.82 | 7 |
| Concept Erasure | Characters SDv1.4 (Target (430) Remain (279)) | CRSt | 0.13 | 6 |
| Concept Erasure | Celebrities SDv1.4 (Target Remain) | CRSt | 9.9 | 6 |
| Concept Erasure | Artistic Style SD v1.4 (Target (693) Remain (430)) | CR Score (Target) | 13 | 6 |
Showing 10 of 19 rows
