
Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models

About

Large language models generate high-quality responses that may contain misinformation, underscoring the need for regulation by distinguishing AI-generated from human-written text. Watermarking is pivotal in this context: it embeds hidden markers, imperceptible to humans, in text during the LLM inference phase. Achieving both detectability of the inserted watermarks and semantic quality of the generated text is challenging. While current watermarking algorithms have made promising progress in this direction, there remains significant scope for improvement. To address these challenges, we introduce a novel multi-objective optimization (MOO) approach for watermarking that uses lightweight networks to generate token-specific watermarking logits and splitting ratios. By leveraging MOO to optimize both a detection objective and a semantic objective, our method achieves detectability and semantic integrity simultaneously. Experimental results show that our method outperforms current watermarking techniques in enhancing the detectability of texts generated by LLMs while maintaining their semantic coherence. Our code is available at https://github.com/mignonjia/TS_watermark.

Mingjia Huo, Sai Ashish Somayajula, Youwei Liang, Ruisi Zhang, Farinaz Koushanfar, Pengtao Xie • 2024
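
To make "token-specific watermarking logits and splitting ratios" concrete, here is a minimal toy sketch of a green-list watermark in which both the splitting ratio (called `gamma` here) and the added logit (called `delta` here) depend on the previous token. It is not the authors' implementation. The hash-based `token_params` function is a hypothetical stand-in for the learned lightweight networks, the random logits stand in for an LLM, and all names and the vocabulary size are assumptions made for illustration.

```python
import hashlib
import math
import random

VOCAB_SIZE = 50  # toy vocabulary size (assumption)

def token_params(prev_token):
    """Hypothetical stand-in for the lightweight networks: maps the
    previous token to a splitting ratio gamma and a watermark logit delta."""
    h = hashlib.sha256(f"params-{prev_token}".encode()).digest()
    gamma = 0.25 + 0.5 * h[0] / 255  # splitting ratio in [0.25, 0.75]
    delta = 1.0 + 2.0 * h[1] / 255   # watermark logit in [1.0, 3.0]
    return gamma, delta

def green_list(prev_token, gamma):
    """Pseudo-randomly pick a gamma-fraction of the vocabulary, seeded by the previous token."""
    perm = list(range(VOCAB_SIZE))
    random.Random(prev_token).shuffle(perm)
    return set(perm[: int(gamma * VOCAB_SIZE)])

def watermark_logits(logits, prev_token):
    """Add the token-specific delta to the logits of green-list tokens."""
    gamma, delta = token_params(prev_token)
    green = green_list(prev_token, gamma)
    return [l + delta if i in green else l for i, l in enumerate(logits)]

def sample(logits, rng):
    """Sample a token id from softmax(logits)."""
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    return rng.choices(range(VOCAB_SIZE), weights=weights)[0]

def generate(n, watermark, seed=0):
    """Generate n tokens from random toy logits (a stand-in for an LLM)."""
    rng = random.Random(seed)
    tokens = [0]
    for _ in range(n):
        logits = [rng.gauss(0, 1) for _ in range(VOCAB_SIZE)]
        if watermark:
            logits = watermark_logits(logits, tokens[-1])
        tokens.append(sample(logits, rng))
    return tokens

def z_score(tokens):
    """Compare the green-token count with its expectation under no watermark.
    Because the green fraction varies per token, mean and variance are summed per position."""
    hits, mean, var = 0, 0.0, 0.0
    for prev, tok in zip(tokens, tokens[1:]):
        gamma, _ = token_params(prev)
        green = green_list(prev, gamma)
        p = len(green) / VOCAB_SIZE  # actual green fraction at this position
        hits += tok in green
        mean += p
        var += p * (1 - p)
    return (hits - mean) / math.sqrt(var)
```

Calling `z_score(generate(200, watermark=True))` and `z_score(generate(200, watermark=False))` should show the difference: the watermarked sequence is expected to score well above a typical detection threshold, while the unwatermarked one should stay near zero. In the paper's setting the per-token parameters are trained with multi-objective optimization rather than hashed, which this sketch does not attempt to reproduce.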

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Language Modeling | LLaMA-2 13B | Perplexity (PPL): 8.466 | 32 |
| Watermark Detection | Llama-3 8B Instruct 30 tokens (generations) | Mean Precision: 9 | 13 |
| Watermark Detection | Llama-3-8B-Instruct 150 tokens (generations) | Mean P: 0.35 | 13 |
| Watermark Detection | Llama2-7B Copy-paste attack | F1 Score: 91.3 | 11 |
| Watermark Detection | Llama2-7B Clean | F1 Score: 100 | 8 |
| Watermark Detection | Llama2 Average 7B | F1 Score: 93.2 | 8 |
| Machine Translation | NLLB-600M | BLEU: 28.355 | 8 |
| Watermark Detection | Llama2-7B Synonymous substitution | F1 Score: 96.4 | 8 |
| Watermark Detection | Llama2-7B Paragraphing | F1 Score: 84.9 | 8 |
| Code Generation | Starcoder | pass@1: 32 | 8 |

Showing 10 of 29 rows
