Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models

About

The unlearning problem of deep learning models, once primarily an academic concern, has become a prevalent issue in the industry. The significant advances in text-to-image generation techniques have prompted global discussions on privacy, copyright, and safety, as numerous unauthorized personal IDs, content, artistic creations, and potentially harmful materials have been learned by these models and later utilized to generate and distribute uncontrolled content. To address this challenge, we propose Forget-Me-Not, an efficient and low-cost solution designed to safely remove specified IDs, objects, or styles from a well-configured text-to-image model in as little as 30 seconds, without impairing its ability to generate other content. Alongside our method, we introduce the Memorization Score (M-Score) and ConceptBench to measure the models' capacity to generate general concepts, grouped into three primary categories: ID, object, and style. Using M-Score and ConceptBench, we demonstrate that Forget-Me-Not can effectively eliminate targeted concepts while maintaining the model's performance on other concepts. Furthermore, Forget-Me-Not offers two practical extensions: a) removal of potentially harmful or NSFW content, and b) enhancement of model accuracy, inclusion and diversity through concept correction and disentanglement. It can also be adapted as a lightweight model patch for Stable Diffusion, allowing for concept manipulation and convenient distribution. To encourage future research in this critical area and promote the development of safe and inclusive generative models, we will open-source our code and ConceptBench at https://github.com/SHI-Labs/Forget-Me-Not.

Eric Zhang, Kai Wang, Xingqian Xu, Zhangyang Wang, Humphrey Shi • 2023

Related benchmarks

Task	Dataset	Result
Text-to-Image Generation	MS-COCO	FID 24.32	193
Text-to-Image Generation	COCO 30k	FID 12.53	77
Coarse-grained Unlearning	Imagenette	Atar 24	70
Text-to-Image Alignment	MS-COCO	CLIP Score 30.56	68
Class Erasure	Imagenette	UA 93.8	66
Object Erasure	CIFAR-10	Accuracy (Erase) 99.46	62
Text-to-Image Generation	MSCOCO 30K	FID 13.99	54
Nudity Erasure	I2P	Total Count 356	52
Explicit Content Removal	I2P	Buttocks Count 12	47
Class-wise Forgetting	ImageNette (val)	FID 0.8	44

Showing 10 of 116 rows
