IdentityGuard: Context-Aware Restriction and Provenance for Personalized Synthesis
About
Personalized text-to-image models pose a unique safety challenge that generic, context-blind methods are ill-equipped to handle. Such global filters face a dilemma: to prevent misuse, they must erase concepts from the model entirely, damaging its broader utility and causing unacceptable collateral damage. Our work takes a more precisely targeted approach, built on the principle that security should be as context-aware as the threat itself and intrinsically bound to the personalized concept. We present IDENTITYGUARD, which realizes this principle through a conditional restriction that blocks harmful content only when it is combined with the personalized identity, and a concept-specific watermark for precise traceability. Experiments show that our approach prevents misuse while preserving the model's utility and enabling robust traceability. By moving beyond blunt, global filters, our work demonstrates a more effective and responsible path toward AI safety.
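As a rough illustration of what an identity-bound (conditional) restriction could look like during fine-tuning, the sketch below applies a safety penalty only when a personalized identifier co-occurs with a restricted term in the training prompt, leaving all other prompts untouched. This is a minimal conceptual sketch, not the paper's actual objective; the identifier token, the restricted-term list, the `safe_pred` target, and the loss weighting `lam` are all assumptions introduced for illustration.

```python
# Conceptual sketch only (not IDENTITYGUARD's exact objective).
# Idea: the standard denoising loss is kept everywhere, and an extra penalty
# pulls the prediction toward a benign ("safe") prediction ONLY when the
# personalized identity appears together with a restricted concept.
import torch
import torch.nn.functional as F

SPECIAL_TOKEN = "<sks>"            # hypothetical personalized identifier
RESTRICTED_TERMS = {"naked", "nude"}  # hypothetical restricted concepts

def is_restricted(prompt: str) -> bool:
    """Trigger the restriction only for the identity + harmful-concept combination."""
    words = set(prompt.lower().split())
    return SPECIAL_TOKEN in prompt and bool(words & RESTRICTED_TERMS)

def conditional_loss(noise_pred: torch.Tensor,
                     noise_target: torch.Tensor,
                     safe_pred: torch.Tensor,
                     prompt: str,
                     lam: float = 1.0) -> torch.Tensor:
    """Denoising loss plus an identity-conditional safety penalty."""
    loss = F.mse_loss(noise_pred, noise_target)
    if is_restricted(prompt):
        # Penalize only the restricted combination; benign uses of the
        # identity (and harmful prompts without it) are unaffected here.
        loss = loss + lam * F.mse_loss(noise_pred, safe_pred.detach())
    return loss
```

The point of the conditional structure is that nothing is erased globally: prompts that use the identity benignly, or that contain the restricted term without the identity, incur only the ordinary training loss.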
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Text-to-Image Generation | Benign Prompts | FID | 54.72 | 7 |
| Text-to-Image Generation | Malicious Prompts | FID-Censored | 393.1 | 6 |
| Watermark Robustness | Benign and Malicious Prompts | Bit Accuracy | 97.1 | 4 |
| Nudity Detection | 100 images generated from malicious 'naked' prompt (test) | Explicit Detections | 1 | 4 |
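For reference, the bit-accuracy figure above is conventionally computed as the fraction of decoded watermark bits that match the embedded payload. The sketch below shows this standard definition; the payload length and the bit values in the example are assumptions, not details taken from the paper.

```python
# Standard watermark bit-accuracy metric (generic definition, not paper-specific).
import numpy as np

def bit_accuracy(embedded_bits: np.ndarray, decoded_bits: np.ndarray) -> float:
    """Both arrays hold 0/1 bits of equal length; returns accuracy in [0, 1]."""
    return float(np.mean(embedded_bits == decoded_bits))

# Hypothetical example with an assumed 48-bit payload:
rng = np.random.default_rng(0)
payload = rng.integers(0, 2, size=48)
decoded = payload.copy()
decoded[:2] ^= 1  # flip 2 bits -> 46/48 correct
print(f"bit accuracy: {bit_accuracy(payload, decoded):.3f}")  # 0.958
```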