MarkTune: Improving the Quality-Detectability Trade-off in Open-Weight LLM Watermarking

About

Watermarking aims to embed hidden signals in generated text that can be reliably detected when given access to a secret key. Open-weight language models pose acute challenges for such watermarking schemes because the inference-time interventions that dominate contemporary approaches cannot be enforced once model weights are public. Existing watermarking techniques for open-weight models, such as the recently proposed GaussMark, typically rely on small modifications to model weights, which can yield signals detectable to those equipped with a secret key, but achieving detection power comparable to inference-time watermarks generally requires weight perturbations that noticeably reduce generation quality. We introduce MarkTune, a theoretically principled, on-policy fine-tuning framework that treats the GaussMark signal as a reward while simultaneously regularizing against degradation in text quality. We derive MarkTune as an improvement on GaussMark and demonstrate that MarkTune consistently improves the quality-detectability trade-off over GaussMark by steering finer-grained, watermark-aware weight updates within the model's representation space while preserving generation quality. Empirically, we show that MarkTune pushes the quality-detectability frontier of GaussMark close to that of inference-time watermarking, remains robust to paraphrasing and fine-tuning attacks, and exhibits strong generalization: a model fine-tuned on one dataset retains substantial watermark detection power on unseen datasets. Together, these results establish MarkTune as a general strategy for embedding robust, high-quality watermarks into open-weight LMs.
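To make the idea of a weight-space watermark concrete, here is a toy sketch of a key-dependent weight perturbation and a matching detection test. It is not the GaussMark or MarkTune implementation: the unigram "language model", the statistic (a normalized inner product between the secret direction and the gradient of the base model's log-likelihood), and all names and constants (`make_key`, `watermark`, `detection_statistic`, `SIGMA`, and so on) are assumptions chosen for illustration.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_key(vocab_size, seed):
    """Hypothetical secret key: a standard Gaussian direction derived from a seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(vocab_size)]


def watermark(base_logits, key, sigma):
    """Shift the weights by a small multiple of the secret direction."""
    return [w + sigma * k for w, k in zip(base_logits, key)]


def sample_tokens(logits, n, rng):
    """Draw n tokens from the toy unigram model defined by the logits."""
    return rng.choices(range(len(logits)), weights=softmax(logits), k=n)


def detection_statistic(tokens, base_logits, key):
    """Normalized inner product between the key and the gradient of the
    base model's log-likelihood of the text (token counts minus n * probs)."""
    probs = softmax(base_logits)
    grad = [-len(tokens) * p for p in probs]
    for t in tokens:
        grad[t] += 1.0
    norm = math.sqrt(sum(g * g for g in grad))
    return sum(k * g for k, g in zip(key, grad)) / norm


# Toy settings (assumed, not from the source).
VOCAB, N_TOKENS, SIGMA = 200, 5000, 0.3
rng = random.Random(0)
base = [rng.gauss(0.0, 0.5) for _ in range(VOCAB)]
key = make_key(VOCAB, seed=1234)

marked_text = sample_tokens(watermark(base, key, SIGMA), N_TOKENS, rng)
plain_text = sample_tokens(base, N_TOKENS, rng)

stat_marked = detection_statistic(marked_text, base, key)
stat_plain = detection_statistic(plain_text, base, key)
print(f"watermarked: {stat_marked:.2f}, unwatermarked: {stat_plain:.2f}")
```

In this toy setting, text that does not depend on the key yields a statistic that behaves like a standard normal draw, while text sampled from the perturbed weights tends to score much higher. Raising `SIGMA` strengthens the signal but moves the sampling distribution further from the base model, which is the quality-detectability trade-off the abstract refers to.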

Yizhou Zhao, Zhiwei Steven Wu, Adam Block • 2025

Related benchmarks

Task | Dataset | Result | Rank
Watermark Detectability | C4 RealNewsLike (Del-0.2) (test) | AUC 94.8 | 28
Text generation quality and watermark detectability | C4 RealNewsLike | AUC 99.7 | 16
Text generation quality and watermark detectability | ELI5 | AUC 99.6 | 16
Watermark Detectability | C4-RealNewsLike Dipper-1 (test) | AUC 0.977 | 14
Watermark Detectability | C4 RealNewsLike Dipper-2 (test) | AUC 85.9 | 14
Watermark Detectability | C4-RealNewsLike Translate (test) | AUC 97.3 | 14
Watermark Detectability | C4 RealNewsLike (Del-0.5) (test) | AUC 78.3 | 14
Watermark Detectability | C4 RealNewsLike (Sub-0.5) (test) | AUC 80.9 | 14
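For readers unfamiliar with the metric in the Result column, the sketch below shows one standard way to compute AUC from detector scores: the fraction of (watermarked, unwatermarked) pairs in which the watermarked text scores higher, with ties counted as one half. The scores are made-up illustrative numbers, not results from the paper, and the function name `auc` is an assumption. Most rows above appear to report the value on a 0 to 100 scale, while the Dipper-1 row appears to use a 0 to 1 scale.

```python
def auc(positive_scores, negative_scores):
    """Probability that a random positive outscores a random negative.

    Ties contribute one half, matching the usual rank-based definition.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Made-up detector scores for illustration only.
watermarked = [3.1, 2.4, 0.2, 4.0]
unwatermarked = [-0.5, 0.9, 0.1, -1.2]

value = auc(watermarked, unwatermarked)
print(f"AUC = {value:.4f} ({100 * value:.2f} on a 0-100 scale)")
```

An AUC of 0.5 corresponds to a detector that cannot tell the two groups apart, and 1.0 to perfect separation, so the drop from the unattacked rows to the Del-0.5 and Sub-0.5 rows reflects how much each attack erodes the detection signal.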