IdentityGuard: Context-Aware Restriction and Provenance for Personalized Synthesis

About

The nature of personalized text-to-image models poses a unique safety challenge that generic, context-blind methods are ill-equipped to handle. Such global filters create a dilemma: to prevent misuse, they must erase concepts entirely, which damages the model's broader utility and causes unacceptable collateral damage. Our work presents a more precisely targeted approach, built on the principle that security should be as context-aware as the threat itself and intrinsically bound to the personalized concept. We present IDENTITYGUARD, which realizes this principle through a conditional restriction that blocks harmful content only when it is combined with the personalized identity, and a concept-specific watermark for precise traceability. Experiments show that our approach prevents misuse while preserving the model's utility and enabling robust traceability. By moving beyond blunt, global filters, our work demonstrates a more effective and responsible path toward AI safety.

Lingyun Zhang, Yu Xie, Ping Chen • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Text-to-Image Generation | Benign Prompts | FID: 54.72 | 7 |
| Text-to-Image Generation | Malicious Prompts | FID-Censored: 393.1 | 6 |
| Watermark Robustness | Benign and Malicious Prompts | Bit Accuracy: 97.1 | 4 |
| Nudity Detection | 100 images generated from malicious 'naked' prompt (test) | Explicit Detections: 1 | 4 |
