MirrorMark: A Distortion-Free Multi-Bit Watermark for Large Language Models

About

As large language models (LLMs) become integral to applications such as question answering and content creation, reliable content attribution has become increasingly important. Watermarking is a promising approach, but existing methods either provide only binary signals or distort the sampling distribution, degrading text quality; distortion-free approaches, in turn, often suffer from weak detectability or robustness. We propose MirrorMark, a multi-bit and distortion-free watermark for LLMs. By mirroring sampling randomness in a measure-preserving manner, MirrorMark embeds multi-bit messages without altering the token probability distribution, preserving text quality by design. To improve robustness, we introduce a context-based scheduler that balances token assignments across message positions while remaining resilient to insertions and deletions. We further provide a theoretical analysis of the equal error rate to interpret empirical performance. Experiments show that MirrorMark matches the text quality of non-watermarked generation while achieving substantially stronger detectability: with 54 bits embedded in 300 tokens, it improves bit accuracy by 8-12% and correctly identifies up to 11% more watermarked texts at 1% false positive rate.

Ya Jiang, Massieh Kordi Boroujeny, Surender Suresh Kumar, Kai Zeng • 2026
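
The abstract's claim that a measure-preserving transformation of the sampling randomness leaves the token distribution untouched can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify the mirroring map, so the reflection u -> 1 - u, the helper names (`sample_token`, `mirror`, `empirical_distribution`), and the toy distribution `probs` are all assumptions chosen for illustration. The sketch only shows that a message bit can change which token is drawn for a given random number while the overall token frequencies still follow the model's probabilities.

```python
import random
from bisect import bisect_right
from collections import Counter
from itertools import accumulate


def sample_token(probs, u):
    """Inverse-CDF sampling: return the index whose CDF interval contains u."""
    cdf = list(accumulate(probs))
    # min() guards against floating-point round-off when u is at or near 1.
    return min(bisect_right(cdf, u), len(probs) - 1)


def mirror(u, bit):
    """Hypothetical mirroring map (an assumption, not taken from the paper):
    reflect u to 1 - u when the message bit is 1, leave it unchanged otherwise.
    Reflection preserves the uniform distribution on [0, 1]."""
    return 1.0 - u if bit else u


def empirical_distribution(probs, bit, n, seed):
    """Draw n tokens using mirrored randomness and return observed frequencies."""
    rng = random.Random(seed)
    counts = Counter(
        sample_token(probs, mirror(rng.random(), bit)) for _ in range(n)
    )
    return [counts[i] / n for i in range(len(probs))]


# Toy next-token distribution over three tokens (illustrative values only).
probs = [0.5, 0.3, 0.2]

# The same random number can select a different token depending on the bit...
token_bit0 = sample_token(probs, mirror(0.1, 0))
token_bit1 = sample_token(probs, mirror(0.1, 1))

# ...yet the long-run token frequencies track probs for either bit value.
freq_bit0 = empirical_distribution(probs, bit=0, n=100_000, seed=0)
freq_bit1 = empirical_distribution(probs, bit=1, n=100_000, seed=0)
```

In this toy setting the bit is recoverable only by someone who knows the random number that was used, which hints at why a keyed detector is needed. How MirrorMark derives that randomness, schedules message positions, and decodes bits is not covered by the abstract.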

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multi-bit Watermarking | LLaMA2-7B 300 tokens (test) | Perplexity 7.0486 | 14 |
| Multi-bit Watermarking | LLM text 200 tokens | Perplexity 7.5709 | 14 |
| Watermark Detectability | 400-token texts paraphrasing attack (test) | AUC 93.06 | 13 |
| Multi-bit Watermarking | LLM text generation 400 tokens, 36 bits | AUC 1 | 12 |
| Text Quality Evaluation | LLM-generated text 300 tokens 36 bits | Distinct-2 94.94 | 12 |
| Multi-bit Watermarking | LLM text generation 400 tokens, 54 bits | Perplexity 6.8855 | 7 |
| Instruction Following | ELI5 prompts Gemma-7B-it 200 tokens (test) | Perplexity 1.7784 | 6 |
| Multi-bit Watermarking | 400-token texts epsilon=0.2 (eval) | AUC 100 | 6 |
| Multi-bit Watermarking | 400-token texts epsilon=0.4 (eval) | AUC 1 | 6 |
| Multi-bit Watermarking | Copy-paste attack 400 tokens, 36 bits epsilon=0.1 edit fraction 0.1 | AUC 1 | 6 |
Showing 10 of 12 rows
