
SemBind: Binding Diffusion Watermarks to Semantics Against Black-Box Forgery Attacks

About

Latent-based watermarks, integrated into the generation process of latent diffusion models (LDMs), simplify detection and attribution of generated images. However, recent black-box forgery attacks, where an attacker needs at least one watermarked image and black-box access to the provider's model, can embed the provider's watermark into images not produced by the provider, posing outsized risk to provenance and trust. We propose SemBind, the first defense framework for latent-based watermarks that resists black-box forgery by binding latent signals to image semantics via a learned semantic masker. Trained with contrastive learning, the masker yields near-invariant codes for the same prompt and near-orthogonal codes across prompts; these codes are reshaped and permuted to modulate the target latent before any standard latent-based watermark. SemBind is generally compatible with existing latent-based watermarking schemes and keeps image quality essentially unchanged, while a simple mask-ratio parameter offers a tunable trade-off between anti-forgery strength and robustness. Across four mainstream latent-based watermark methods, our SemBind-enabled anti-forgery variants markedly reduce false acceptance under black-box forgery while providing a controllable robustness-security balance.
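The binding idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the learned, contrastively trained masker is replaced here by a hypothetical `semantic_code` function that derives a pseudo-random ±1 code from the prompt string, and "modulation" is assumed to mean sign-flipping a keyed subset of latent entries. The names `semantic_code`, `modulate`, `agreement`, `mask_ratio`, and `key` are all illustrative assumptions.

```python
import random

def semantic_code(prompt: str, n: int) -> list[int]:
    """Hypothetical stand-in for the learned masker: a +/-1 code seeded by the prompt.
    Identical prompts give identical codes; different prompts give roughly
    uncorrelated codes."""
    rng = random.Random(prompt)
    return [rng.choice((-1, 1)) for _ in range(n)]

def modulate(latent: list[float], code: list[int], mask_ratio: float, key: int) -> list[float]:
    """Assumed modulation: multiply a keyed subset of latent entries by the code.
    Because the code is +/-1, applying this twice with the same code undoes it."""
    n = len(latent)
    perm = list(range(n))
    random.Random(key).shuffle(perm)      # keyed permutation of latent positions
    k = int(mask_ratio * n)               # number of positions bound to semantics
    out = list(latent)
    for j, pos in enumerate(perm[:k]):
        out[pos] = latent[pos] * code[j]
    return out

def agreement(a: list[float], b: list[float]) -> float:
    """Fraction of positions whose signs match."""
    return sum((x >= 0) == (y >= 0) for x, y in zip(a, b)) / len(a)

n, key, mask_ratio = 4096, 1234, 0.5
rng = random.Random(0)
watermarked = [rng.gauss(0.0, 1.0) for _ in range(n)]   # toy watermarked latent

# Provider binds the latent to the semantics of the generating prompt.
bound = modulate(watermarked, semantic_code("a red bicycle", n), mask_ratio, key)

# Verifier demodulates with the code derived from the image's semantics.
genuine = modulate(bound, semantic_code("a red bicycle", n), mask_ratio, key)
forged = modulate(bound, semantic_code("a snowy mountain", n), mask_ratio, key)

print("genuine agreement:", agreement(genuine, watermarked))
print("forged agreement:", agreement(forged, watermarked))
```

In this toy, a matching code restores the latent exactly, while a mismatched code flips about half of the masked positions, so a larger `mask_ratio` scrambles more of the watermark signal when the semantics differ. A real masker would need to give near-identical codes for semantically equivalent content rather than for identical strings, which the string-seeded stand-in does not capture.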

Xin Zhang, Zijin Yang, Kejiang Chen, Linfeng Ma, Weiming Zhang, Nenghai Yu • 2026

Related benchmarks

| Task | Dataset | Result (Detection Rate) | Rank |
|---|---|---|---|
| Imprinting Attack | COCO | 0.00e+0 | 54 |
| Imprint Forgery Attack | SDP prompt v1 (val) | 49 | 48 |
| Reprompt Forgery Attack | SDP prompt (val) | 60 | 16 |
| Reprompt Forgery Attack | COCO prompt (val) | 50 | 16 |
| Reprompt Forgery Attack | FredZhang7/stable-diffusion-prompts SD 2.1 attacker 2.47M | 56 | 8 |
| Reprompt Forgery Attack | FredZhang7/stable-diffusion-prompts-2.47M SD 1.5 attacker | 50 | 8 |
| Imprinting Attack | SDP | 0.00e+0 | 6 |
| Reprompting Attack | COCO | 1 | 2 |
| Reprompting Attack | SDP | 0.00e+0 | 2 |
