
Watermarking Diffusion Language Models

About

We introduce the first watermark tailored for diffusion language models (DLMs), an emergent LLM paradigm able to generate tokens in arbitrary order, in contrast to standard autoregressive language models (ARLMs), which generate tokens sequentially. While there has been much work on ARLM watermarking, a key challenge when applying these schemes directly to the DLM setting is that they rely on previously generated tokens, which are not always available during DLM generation. In this work we address this challenge by: (i) applying the watermark in expectation over the context even when some context tokens are yet to be determined, and (ii) promoting tokens that increase the watermark strength when used as context for other tokens. This is accomplished while keeping the watermark detector unchanged. Our experimental evaluation demonstrates that the DLM watermark achieves a >99% true positive rate with minimal quality impact and similar robustness to existing ARLM watermarks, enabling reliable DLM watermarking for the first time.
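The expectation-over-context idea in (i) can be illustrated with a standard green/red-list watermark, where a pseudorandom "green list" seeded by the context token receives a logit bias. In the DLM setting the context position may still be undetermined, so the bias can be averaged over all possible context tokens, weighted by the model's current beliefs. The sketch below is an illustrative assumption of how this could look, not the paper's actual implementation; all names (`green_mask`, `watermark_bias`, `GAMMA`, `DELTA`) are hypothetical.

```python
import hashlib

import numpy as np

GAMMA = 0.25  # illustrative green-list fraction
DELTA = 2.0   # illustrative logit bias added to green tokens


def green_mask(context_token: int, vocab_size: int, gamma: float = GAMMA) -> np.ndarray:
    """Pseudorandom green-list indicator vector seeded by the context token."""
    seed = int(hashlib.sha256(str(context_token).encode()).hexdigest(), 16) % (2**32)
    rng = np.random.default_rng(seed)
    return rng.random(vocab_size) < gamma


def watermark_bias(context_probs: np.ndarray, vocab_size: int, delta: float = DELTA) -> np.ndarray:
    """Expected watermark bias when the context token is not yet determined.

    context_probs[c] is the model's current probability that the context
    position resolves to token c; the per-token bias is the probability-
    weighted average of the green-list bias over all possible contexts.
    """
    bias = np.zeros(vocab_size)
    for c, p in enumerate(context_probs):
        if p > 0:
            bias += p * delta * green_mask(c, vocab_size)
    return bias
```

With a one-hot `context_probs` (the context is already generated), this reduces exactly to the usual ARLM green-list bias, which is why the same detector can remain unchanged.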

Thibaud Gloaguen, Robin Staab, Nikola Jovanović, Martin Vechev • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Watermark Detection and Quality Evaluation | Aggregate (MMLU, HellaSwag, ARC-C, GPQA, MBPP, GSM8K) | TPR@1 | 0.99 | 18 |
| Diffusion Language Model Watermarking | WaterBench (600 prompts, 2024) | PPL | 5.07 | 9 |
