Image Can Bring Your Memory Back: A Novel Multi-Modal Guided Attack against Image Generation Model Unlearning

About

Recent advances in image generation models (IGMs), particularly diffusion-based architectures such as Stable Diffusion (SD), have markedly enhanced the quality and diversity of AI-generated visual content. However, their generative capability has also raised significant ethical, legal, and societal concerns, including the potential to produce harmful, misleading, or copyright-infringing content. To mitigate these concerns, machine unlearning (MU) has emerged as a promising solution that selectively removes undesirable concepts from pretrained models. Nevertheless, the robustness and effectiveness of existing unlearning techniques remain largely unexplored, particularly in the presence of multi-modal adversarial inputs. To bridge this gap, we propose Recall, a novel adversarial framework explicitly designed to compromise the robustness of unlearned IGMs. Unlike existing approaches that predominantly rely on adversarial text prompts, Recall exploits the intrinsic multi-modal conditioning capabilities of diffusion models by efficiently optimizing adversarial image prompts with guidance from a single semantically relevant reference image. Extensive experiments across ten state-of-the-art unlearning methods and diverse tasks show that Recall consistently outperforms existing baselines in terms of adversarial effectiveness, computational efficiency, and semantic fidelity with the original textual prompt. These findings reveal critical vulnerabilities in current unlearning mechanisms and underscore the need for more robust solutions to ensure the safety and reliability of generative models. Code and data are publicly available at https://github.com/ryliu68/RECALL.

Renyang Liu, Guanlin Li, Tianwei Zhang, See-Kiong Ng • 2025

Related benchmarks

Task	Dataset	Result
Semantic Alignment	Nudity-I2P	CLIP Score 32.13	31
Semantic Alignment	Van Gogh	CLIP Score 35.92	31
Semantic Alignment	Parachute	CLIP Score 31.1	31
Semantic Alignment	Church	CLIP Score 30.37	30
Nudity Unlearning	I2P	ESD 71.83	11
Style Unlearning	Van Gogh style	ESD 92	11
Object Unlearning	Object Church	ESD 96	11
Object Unlearning	Object-Parachute	ESD 100	11
Nudity Unlearning	MMA	ESD 75.78	10
Nudity Unlearning	ArT	ESD 62.5	10
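
The Semantic Alignment rows report a CLIP Score, which measures how well a generated image matches its text prompt. The listing does not say how the score is computed, so the sketch below assumes one common convention: 100 times the cosine similarity between an image embedding and a text embedding, clipped at zero. The function names and the toy vectors are hypothetical; in practice the embeddings would come from a CLIP image encoder and text encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def clip_score(image_emb, text_emb, scale=100.0):
    """Assumed convention: scale * max(cos(image, text), 0).

    image_emb and text_emb stand in for CLIP encoder outputs;
    the scale of 100 is an assumption, not taken from the paper.
    """
    return scale * max(cosine_similarity(image_emb, text_emb), 0.0)


if __name__ == "__main__":
    # Toy 3-dimensional vectors used purely for illustration.
    image_emb = [0.2, 0.9, 0.1]
    text_emb = [0.1, 0.8, 0.3]
    print(f"CLIP score (toy example): {clip_score(image_emb, text_emb):.2f}")
```

Under this convention a higher score means the image is closer to the prompt in CLIP embedding space, and scores for unrelated or opposed pairs bottom out at zero.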
