Unsafe2Safe: Controllable Image Anonymization for Downstream Utility

About

Large-scale image datasets frequently contain identifiable or sensitive content, raising privacy risks when training models that may memorize and leak such information. We present Unsafe2Safe, a fully automated pipeline that detects privacy-prone images and rewrites only their sensitive regions using multimodally guided diffusion editing. Unsafe2Safe operates in two stages. Stage 1 uses a vision-language model to (i) inspect images for privacy risks, (ii) generate paired private and public captions that respectively include and omit sensitive attributes, and (iii) prompt a large language model to produce structured, identity-neutral edit instructions conditioned on the public caption. Stage 2 employs instruction-driven diffusion editors to apply these dual textual prompts, producing privacy-safe images that preserve global structure and task-relevant semantics while neutralizing private content. To measure anonymization quality, we introduce a unified evaluation suite covering Quality, Cheating, Privacy, and Utility dimensions. Across MS-COCO, Caltech101, and MIT Indoor67, Unsafe2Safe reduces face similarity, text similarity, and demographic predictability by large margins, while maintaining downstream model accuracy comparable to training on raw data. Fine-tuning diffusion editors on our automatically generated triplets (private caption, public caption, edit instruction) further improves both privacy protection and semantic fidelity. Unsafe2Safe provides a scalable, principled solution for constructing large, privacy-safe datasets without sacrificing visual consistency or downstream utility.
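The two-stage flow described above can be sketched in code. This is only an illustrative outline, not the authors' implementation: the names `EditTriplet`, `Stage1Models`, `stage1`, and `anonymize` are hypothetical, and the vision-language model, the large language model, and the diffusion editor are passed in as plain callables because the abstract does not specify which models or APIs are used.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class EditTriplet:
    """The (private caption, public caption, edit instruction) triplet from the abstract."""
    private_caption: str   # caption that includes sensitive attributes
    public_caption: str    # caption that omits sensitive attributes
    edit_instruction: str  # identity-neutral instruction for the editor


@dataclass
class Stage1Models:
    # All four callables are hypothetical stand-ins for real VLM / LLM calls.
    is_unsafe: Callable[[str], bool]          # (i) privacy inspection
    caption_private: Callable[[str], str]     # (ii) private caption
    caption_public: Callable[[str], str]      # (ii) public caption
    write_instruction: Callable[[str], str]   # (iii) instruction from the public caption


def stage1(image_path: str, models: Stage1Models) -> Optional[EditTriplet]:
    """Return a triplet for privacy-prone images, or None for images judged safe."""
    if not models.is_unsafe(image_path):
        return None
    private = models.caption_private(image_path)
    public = models.caption_public(image_path)
    # The instruction is conditioned on the public caption only.
    instruction = models.write_instruction(public)
    return EditTriplet(private, public, instruction)


def anonymize(
    image_path: str,
    models: Stage1Models,
    editor: Callable[[str, EditTriplet], str],
) -> str:
    """Stage 2: run the (assumed) diffusion editor only on flagged images."""
    triplet = stage1(image_path, models)
    if triplet is None:
        return image_path  # safe images pass through unchanged
    return editor(image_path, triplet)
```

Keeping the models as injected callables means the control flow can be exercised with simple stubs before any real captioner or editor is wired in. The triplet structure is the same one the abstract describes as fine-tuning data for the diffusion editors.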

Mih Dinh, SouYoung Jin • 2026

Related benchmarks

Task	Dataset	Result
Anonymization	Cal101	Accuracy 94.926	14
Anonymization	Indoor	Accuracy 82.537	14
Image Classification	MIT Indoor 67	Accuracy 80.746	8
Image Anonymization Evaluation	Caltech101	CLIP Score 30.94	7
Image Anonymization Evaluation	MIT Indoor 67	CLIP Score 34.48	7

