DeMark: A Query-Free Black-Box Attack on Deepfake Watermarking Defenses

About

The rapid proliferation of realistic deepfakes has raised urgent concerns over their misuse, motivating the use of defensive watermarks in synthetic images for reliable detection and provenance tracking. However, this defense paradigm assumes such watermarks are inherently resistant to removal. We challenge this assumption with DeMark, a query-free black-box attack framework that targets defensive image watermarking schemes for deepfakes. DeMark exploits latent-space vulnerabilities in encoder-decoder watermarking models through a compressive-sensing-based sparsification process, suppressing watermark signals while preserving perceptual and structural realism appropriate for deepfakes. Across eight state-of-the-art watermarking schemes, DeMark reduces watermark detection accuracy from 100% to 32.9% on average while maintaining natural visual quality, outperforming existing attacks. We further evaluate three defense strategies (image super-resolution, sparse watermarking, and adversarial training) and find them largely ineffective. These results demonstrate that current encoder-decoder watermarking schemes remain vulnerable to latent-space manipulations, underscoring the need for more robust watermarking methods to safeguard against deepfakes.
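The abstract does not spell out the sparsification step, so the following is only a generic illustration of what sparsifying a latent vector can mean, not DeMark's actual procedure. It assumes the latent code is a flat list of floats and uses hard thresholding (keeping the `k` largest-magnitude entries), a common building block in compressive-sensing recovery. The function name `keep_top_k` and the parameter `k` are hypothetical.

```python
def keep_top_k(latent, k):
    """Zero out all but the k largest-magnitude entries of a latent vector.

    Illustrative hard-thresholding step only; this is an assumption about
    what "sparsification" could look like, not the method from the paper.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    # Indices sorted by descending magnitude; sorted() is stable, so ties
    # are resolved in favour of the earlier index.
    order = sorted(range(len(latent)), key=lambda i: abs(latent[i]), reverse=True)
    keep = set(order[:k])
    return [value if i in keep else 0.0 for i, value in enumerate(latent)]


if __name__ == "__main__":
    print(keep_top_k([0.1, -2.0, 0.5, 3.0], 2))
```

The intuition is that a low-energy watermark signal spread across many small latent coefficients may be discarded by such a step while the dominant image content survives; whether and how DeMark does this is described in the paper itself, not in this listing.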

Wei Song, Zhenchang Xing, Liming Zhu, Yulei Sui, Jingling Xue • 2026

Related benchmarks

Task	Dataset	Result
Post-attack image integrity	COCO	PSNR 29.27	24
Post-attack image integrity	OpenImage	PSNR 29.63	24
Watermark Removal Attack	SS in-processing watermarking scheme	Bit Accuracy 55.9	13
Watermark Attack	OpenImage	MBRS Bit Accuracy 63.6	5
Watermark Attack	COCO	MBRS Bit Accuracy 60.3	5
Watermark Removal Attack	PTW in-processing	BitAcc 0.683	5
Post-attack image integrity	PTW	PSNR 32.46	4
Post-attack image integrity	SS	PSNR 29.43	4
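The two metrics in the table, PSNR and bit accuracy, have standard definitions that can be sketched as follows. This is a minimal illustration assuming 8-bit pixel values given as flat sequences and watermark messages given as sequences of 0/1 bits; the function names are hypothetical and the exact evaluation protocol used for the reported numbers is not stated here.

```python
import math


def psnr(original, attacked, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel sequences."""
    if len(original) != len(attacked) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, attacked)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def bit_accuracy(embedded_bits, decoded_bits):
    """Fraction of watermark bits that the decoder recovers correctly."""
    if len(embedded_bits) != len(decoded_bits) or not embedded_bits:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for e, d in zip(embedded_bits, decoded_bits) if e == d)
    return matches / len(embedded_bits)


if __name__ == "__main__":
    print(psnr([10, 20, 30], [12, 18, 30]))
    print(bit_accuracy([1, 0, 1, 1], [1, 1, 1, 0]))
```

Higher PSNR means the attacked image stays closer to the original, while a bit accuracy near 0.5 means the decoded message is close to random guessing. The table appears to report bit accuracy as a percentage in some rows (for example 55.9) and as a fraction in another (0.683), so values may need rescaling before comparison.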
