
SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation

About

Existing watermarking algorithms are vulnerable to paraphrase attacks because of their token-level design. To address this issue, we propose SemStamp, a robust sentence-level semantic watermarking algorithm based on locality-sensitive hashing (LSH), which partitions the semantic space of sentences. The algorithm encodes and LSH-hashes a candidate sentence generated by an LLM, and conducts sentence-level rejection sampling until the sampled sentence falls in watermarked partitions in the semantic embedding space. A margin-based constraint is used to enhance its robustness. To show the advantages of our algorithm, we propose a "bigram" paraphrase attack using the paraphrase that has the fewest bigram overlaps with the original sentence. This attack is shown to be effective against the existing token-level watermarking method. Experimental results show that our novel semantic watermark algorithm is not only more robust than the previous state-of-the-art method on both common and bigram paraphrase attacks, but also is better at preserving the quality of generation.
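The sentence-level rejection sampling and the bigram attack described above can be illustrated with a small sketch. This is not the authors' implementation. It assumes random-hyperplane LSH over cosine similarity, a set of watermarked partitions chosen from a secret key, and a margin test that rejects embeddings lying too close to any hyperplane. `toy_embed` is a hypothetical stand-in for a real sentence encoder, the candidate list stands in for sentences sampled from an LLM, and all function names and parameters are illustrative.

```python
import hashlib
import math
import random

DIM = 16     # toy embedding dimension (assumption)
N_BITS = 3   # number of hyperplanes, giving 2**N_BITS partitions (assumption)


def toy_embed(sentence):
    """Hypothetical encoder: maps a sentence to a deterministic unit vector."""
    digest = hashlib.sha256(sentence.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    vec = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def make_hyperplanes(seed=0):
    """Draw N_BITS random normal vectors that partition the embedding space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(N_BITS)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def lsh_signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    sig = 0
    for plane in planes:
        sig = (sig << 1) | (1 if cosine(vec, plane) > 0 else 0)
    return sig


def watermarked_partitions(secret_key, ratio=0.5):
    """Assumption: a keyed pseudo-random subset of partitions is 'watermarked'."""
    rng = random.Random(secret_key)
    n = 2 ** N_BITS
    return set(rng.sample(range(n), max(1, int(n * ratio))))


def sample_watermarked(candidates, planes, valid, margin=0.0):
    """Rejection sampling: return the first candidate whose embedding lands in
    a watermarked partition and stays at least `margin` (in absolute cosine)
    away from every hyperplane. Returns None if every candidate is rejected."""
    for cand in candidates:
        vec = toy_embed(cand)
        far_enough = all(abs(cosine(vec, p)) >= margin for p in planes)
        if far_enough and lsh_signature(vec, planes) in valid:
            return cand
    return None


def bigrams(sentence):
    tokens = sentence.lower().split()
    return set(zip(tokens, tokens[1:]))


def least_bigram_overlap(original, paraphrases):
    """'Bigram' attack selection: pick the paraphrase sharing the fewest
    bigrams with the original sentence."""
    ref = bigrams(original)
    return min(paraphrases, key=lambda p: len(bigrams(p) & ref))
```

A larger `margin` rejects more candidates, trading sampling cost for embeddings that are less likely to cross a hyperplane when the sentence is paraphrased. Whitespace tokenization in `bigrams` is a simplification chosen for the sketch.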

Abe Bohan Hou, Jingyu Zhang, Tianxing He, Yichen Wang, Yung-Sung Chuang, Hongwei Wang, Lingfeng Shen, Benjamin Van Durme, Daniel Khashabi, Yulia Tsvetkov • 2023

Related benchmarks

Task                    Dataset          Result          Rank
Machine Translation     WMT De-En 19     COMET 87.4      6
Summarization           Xsum             ROUGE-L 20      6
Code Generation         MBPP             Pass@1 34.2     5
Paragraph Translation   WMT 23 (test)    BLEU 42.8       4
Open-ended generation   C4 RealNews      Perplexity 3.6  4
