MACE: Mass Concept Erasure in Diffusion Models

About

The rapid expansion of large-scale text-to-image diffusion models has raised growing concerns regarding their potential misuse in creating harmful or misleading content. In this paper, we introduce MACE, a finetuning framework for the task of mass concept erasure. This task aims to prevent models from generating images that embody unwanted concepts when prompted. Existing concept erasure methods are typically restricted to handling fewer than five concepts simultaneously and struggle to find a balance between erasing concept synonyms (generality) and maintaining unrelated concepts (specificity). In contrast, MACE scales the erasure scope up to 100 concepts and achieves an effective balance between generality and specificity. It does so by combining closed-form cross-attention refinement with LoRA finetuning, which together eliminate the information of undesirable concepts. Furthermore, MACE integrates multiple LoRAs without mutual interference. We conduct extensive evaluations of MACE against prior methods across four different tasks: object erasure, celebrity erasure, explicit content erasure, and artistic style erasure. Our results reveal that MACE surpasses prior methods in all evaluated tasks. Code is available at https://github.com/Shilin-LU/MACE.
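
The abstract names a closed-form cross-attention refinement but does not state its objective. As a rough illustration of what a closed-form update of a projection matrix can look like, the sketch below solves a generic ridge-regularized least-squares problem: it remaps a few "source" embeddings to the outputs of "destination" embeddings while staying close to the original weights. This is an assumption made for illustration, not MACE's actual formulation; the function name `closed_form_remap` and the parameter `lam` are hypothetical and do not come from the MACE code base.

```python
import numpy as np


def closed_form_remap(W, E_src, E_dst, lam=0.1):
    """Generic ridge-style closed-form edit (illustrative assumption, not MACE's objective).

    Returns W' minimizing
        sum_i ||W' e_src_i - W e_dst_i||^2 + lam * ||W' - W||_F^2

    W:     (d_out, d_in) projection matrix, e.g. a key or value projection
    E_src: (d_in, n) embeddings whose outputs should change (columns)
    E_dst: (d_in, n) embeddings whose outputs should be imitated (columns)
    """
    d_in = W.shape[1]
    # Setting the gradient to zero gives W' (E_src E_src^T + lam I) = W E_dst E_src^T + lam W.
    lhs = W @ E_dst @ E_src.T + lam * W
    rhs = E_src @ E_src.T + lam * np.eye(d_in)  # symmetric positive definite for lam > 0
    # rhs is symmetric, so W' rhs = lhs is equivalent to rhs W'^T = lhs^T.
    return np.linalg.solve(rhs, lhs.T).T
```

With a small `lam`, the edited matrix sends each source embedding to nearly the same output that the original matrix produced for the paired destination embedding, while inputs orthogonal to the source embeddings are left almost unchanged.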

Shilin Lu, Zilan Wang, Leyang Li, Yanzhu Liu, Adams Wai-Kin Kong • 2024

Related benchmarks

Task	Dataset	Result
Text-to-Image Generation	MS-COCO	FID 18.36	145
Continual Concept Learning	10 Sequential Concepts (test)	UA 99	70
Text-to-Image Generation	COCO 30k	FID 12.4	63
Object Erasure	CIFAR-10	Accuracy (Erase) 13.47	62
Text-to-Image Alignment	MS-COCO	CLIP Score 31.05	60
Text-to-Image Generation	MSCOCO 30K	FID 12.71	54
Explicit Content Removal	I2P	Buttocks Count 2	47
Nudity Erasure	I2P	Total Count 256	44
Art Style Erasure	Artist style and content prompts 5 groups SD v1.4 based (test)	CS Style 28.452	40
Concept Erasure	Van Gogh style	FID 59.58	39

Showing 10 of 187 rows
