SimMark: A Robust Sentence-Level Similarity-Based Watermarking Algorithm for Large Language Models

About

The widespread adoption of large language models (LLMs) necessitates reliable methods to detect LLM-generated text. We introduce SimMark, a robust sentence-level watermarking algorithm that makes LLMs' outputs traceable without requiring access to model internals, making it compatible with both open and API-based LLMs. SimMark uses the similarity of semantic sentence embeddings, combined with rejection sampling, to embed detectable statistical patterns that are imperceptible to humans, and employs a soft counting mechanism to achieve robustness against paraphrasing attacks. Experimental results demonstrate that SimMark sets a new benchmark for robust watermarking of LLM-generated content, surpassing prior sentence-level watermarking techniques in robustness, sampling efficiency, and applicability across diverse domains, while maintaining text quality and fluency.
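The three ingredients named in the abstract (embedding similarity between sentences, rejection sampling, and soft counting) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the letter-count `toy_embed` stands in for a real semantic sentence encoder, the comparison is made against the immediately preceding sentence, the validity interval `INTERVAL` and the smoothness factor `SOFT_K` are hypothetical values, and the exponential form of `soft_count` is just one plausible way to give partial credit near the interval edges.

```python
import math

# Hypothetical parameters, chosen only for this sketch.
INTERVAL = (0.4, 0.8)  # a sentence is "valid" if its similarity falls in [a, b]
SOFT_K = 20.0          # how quickly partial credit decays outside the interval


def toy_embed(sentence):
    """Stand-in for a semantic sentence encoder: unit-norm letter counts."""
    vec = [0.0] * 26
    for ch in sentence.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def similarity(prev, cand):
    """Cosine similarity between the embeddings of two sentences."""
    return sum(a * b for a, b in zip(toy_embed(prev), toy_embed(cand)))


def is_valid(prev, cand, interval=INTERVAL):
    a, b = interval
    return a <= similarity(prev, cand) <= b


def rejection_sample(prev, propose, max_tries=50):
    """Call propose() (a black-box sentence generator) until a candidate is valid.

    Falls back to the last candidate if none is accepted within max_tries.
    """
    cand = None
    for _ in range(max_tries):
        cand = propose()
        if is_valid(prev, cand):
            return cand
    return cand


def soft_count(sim, interval=INTERVAL, k=SOFT_K):
    """Full credit inside the interval, smoothly decaying credit outside it."""
    a, b = interval
    if a <= sim <= b:
        return 1.0
    gap = a - sim if sim < a else sim - b
    return math.exp(-k * gap)


def soft_score(sentences):
    """Sum of soft counts over consecutive sentence pairs."""
    return sum(soft_count(similarity(p, c)) for p, c in zip(sentences, sentences[1:]))
```

Because `propose` is only ever called as a function that returns text, the generation side of this sketch needs no access to logits or weights, which is the property the abstract highlights for API-based models. A detector would compare `soft_score` against what unwatermarked text typically produces; the soft credit means a paraphrase that nudges a similarity just outside the interval still contributes to the score.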

Amirhossein Dabiriaghdam, Lele Wang • 2025

Related benchmarks

Task	Dataset	Metric	Result
Watermark Detection	BookSum	TP @ FP=1%	70.8	154
Watermark Detection	C4	TPR @ FPR=1%	0.138	95
Watermarking	Natural Questions (NQ) (test)	AUROC	100	45
Sentence-Level Watermarking	C4	AUROC	98.3	40
Watermark Detection	C4	Detection Accuracy (No Attack)	77.6	24
Watermarking Detection	BookSum (test)	Detection Rate (No Attack)	88.2	24
Watermarking Token Efficiency	BookSum (test)	Avg Tokens per Sentence	186.7	5
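
Several rows above report a true positive rate at a fixed 1% false positive rate. As a rough illustration of how such a figure is commonly computed from detector scores, the sketch below picks a threshold from scores on unwatermarked text and measures the fraction of watermarked texts that exceed it. The function name, the convention that higher scores mean "more likely watermarked", and the sample scores are all assumptions for this example and are not drawn from the paper's evaluation code.

```python
import math


def tpr_at_fpr(human_scores, watermarked_scores, target_fpr=0.01):
    """True positive rate at the threshold that keeps the FPR at or below target_fpr.

    A text is flagged as watermarked when its score is strictly above the threshold.
    """
    n = len(human_scores)
    # Number of human-written texts we are allowed to flag by mistake.
    allowed_fp = math.floor(target_fpr * n + 1e-9)
    if allowed_fp >= n:
        threshold = float("-inf")
    else:
        # At most allowed_fp human scores lie strictly above this value.
        threshold = sorted(human_scores, reverse=True)[allowed_fp]
    flagged = sum(1 for s in watermarked_scores if s > threshold)
    return flagged / len(watermarked_scores)


# Hypothetical scores: 100 human texts scoring 0..99, four watermarked texts.
human = list(range(100))
watermarked = [97, 98, 99, 150]
rate = tpr_at_fpr(human, watermarked)  # threshold is 98, so two of four are flagged
```

Fixing the false positive rate first makes results comparable across detectors whose raw scores live on different scales.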
