MirrorMark: Generalizable Mirrored Sampling for Multi-bit LLM Watermarking

About

As large language models (LLMs) become integral to applications such as question answering and content creation, reliable content attribution has become increasingly important. Watermarking is a promising approach, but most existing methods either provide only binary signals or achieve multi-bit embedding by distorting the generation distribution. We propose MirrorMark, a generalizable mapping-centric approach for multi-bit LLM watermarking. MirrorMark separates the symbol mapping rule from the base watermarking sampler and maps each symbol to a mod-1 mirroring transformation of a detector-reproducible pseudorandom object, such as sampling values or permutation ranks. A binary-tokenizer analysis shows that complementary mappings yield larger matched–mismatched score gaps than independent-key or shift-based mappings. When composed with a distortion-free base sampler, MirrorMark preserves the token probability distribution by design and maintains text quality in practice. To support practical payload embedding, we introduce a Context-Anchored Balanced Scheduler (CABS), which balances token assignments across message positions while localizing edit effects. We further provide theoretical EER analyses for two representative sampler instantiations. Experiments show that MirrorMark achieves strong detectability and bit accuracy while maintaining text quality comparable to non-watermarked generation.
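
The abstract does not give the exact construction, so the following is only a minimal sketch of the mirroring idea under stated assumptions: a single-bit message, an exponential-minimum (Gumbel-max style) base sampler, and a mirroring rule that keeps a sampling value u for bit 0 and maps it to (1 - u) mod 1 for bit 1. The names `prf_uniform`, `mirror`, `sample_token`, `bit_scores`, and `generate`, the keyed-hash construction, and the toy random token distributions are all hypothetical and are not taken from the paper.

```python
import hashlib
import math
import random


def prf_uniform(key, context, token):
    """Keyed hash -> value strictly inside (0, 1); a detector can recompute it."""
    digest = hashlib.sha256(f"{key}|{context}|{token}".encode()).digest()
    k = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits so k + 0.5 is exact
    return (k + 0.5) / 2**52


def mirror(u, bit):
    """Assumed mirroring rule: bit 0 keeps u, bit 1 maps u to (1 - u) mod 1."""
    return u if bit == 0 else (1.0 - u) % 1.0


def sample_token(probs, key, context, bit):
    """Pick argmax of u ** (1 / p) over tokens, using the mirrored values."""
    best_token, best_score = -1, -math.inf
    for token, p in enumerate(probs):
        if p <= 0.0:
            continue
        u = mirror(prf_uniform(key, context, token), bit)
        score = math.log(u) / p  # same argmax as u ** (1 / p)
        if score > best_score:
            best_token, best_score = token, score
    return best_token


def bit_scores(tokens, key, window=4):
    """Score the text under both bit hypotheses; the larger score is the decoded bit."""
    scores = [0.0, 0.0]
    for i, token in enumerate(tokens):
        context = tuple(tokens[max(0, i - window):i])
        u = prf_uniform(key, context, token)
        for bit in (0, 1):
            scores[bit] += -math.log(1.0 - mirror(u, bit))
    return scores


def generate(bit, key, n_tokens=80, vocab=16, window=4, seed=0):
    """Toy 'model': a fresh random distribution over the vocabulary at each step."""
    rng = random.Random(seed)
    tokens = []
    for _ in range(n_tokens):
        weights = [rng.random() for _ in range(vocab)]
        total = sum(weights)
        probs = [w / total for w in weights]
        tokens.append(sample_token(probs, key, tuple(tokens[-window:]), bit))
    return tokens


if __name__ == "__main__":
    for bit in (0, 1):
        scores = bit_scores(generate(bit, "demo-key", seed=bit), "demo-key")
        print("embedded", bit, "decoded", scores.index(max(scores)))
```

Because 1 - u is uniform on (0, 1) whenever u is, the mirrored values drive this particular sampler with the same token probabilities as the unmirrored ones, which is the sense in which such a composition would stay distortion-free. The two hypotheses score the same value from opposite ends of the interval, which illustrates how a complementary mapping can separate matched from mismatched scores.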

Ya Jiang, Massieh Kordi Boroujeny, Surender Suresh Kumar, Kai Zeng • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multi-bit Watermarking | LLaMA2-7B 300 tokens (test) | Perplexity 7.0486 | 14 |
| Multi-bit Watermarking | LLM text 200 tokens | Perplexity 7.5709 | 14 |
| Watermark Detectability | 400-token texts paraphrasing attack (test) | AUC 93.06 | 13 |
| Multi-bit Watermarking | LLM text generation 400 tokens, 36 bits | AUC 1 | 12 |
| Text Quality Evaluation | LLM-generated text 300 tokens 36 bits | Distinct-2 94.94 | 12 |
| Distortion-Free Watermark Evaluation | ELI5 16-bit bitstrings LLAMA3.1-8B | Message Accuracy 5.7 | 8 |
| Multi-bit Watermarking | LLM text generation 400 tokens, 54 bits | Perplexity 6.8855 | 7 |
| Synonym Substitution Robustness | ELI5 prompts 32-bit payload Llama3.1-8B (test) | Bit Accuracy (10% Substitution) 54.4 | 7 |
| Word Deletion Robustness | ELI5 prompts 32-bit payload Llama3.1-8B (test) | Bit Accuracy (10% Deletion) 53.9 | 7 |
| Paraphrasing Robustness | ELI5 prompts 32-bit payload Llama3.1-8B (test) | Bit Accuracy 49.8 | 7 |
Showing 10 of 18 rows
