CLASP: Training-Free LLM-Assisted Source Code Watermarking via Semantic-Preserving Transformations

About

The proliferation of open-source code and large language models (LLMs) for code generation has amplified the risks of unauthorized reuse and intellectual property infringement. Source code watermarking offers a potential solution, yet existing methods typically encode watermarks through identifiers, local code patterns, or limited handcrafted edits, leaving them vulnerable to renaming, refactoring, and adaptive watermark removal. These limitations hinder the joint achievement of robustness, capacity, generalization, and deployment efficiency. We propose CLASP, a Code LLM-Assisted Semantic-Preserving watermarking framework that enables training-free, plug-and-play watermarking for source code. CLASP embeds watermark bits within a fixed space of semantics-preserving transformations, enabling automated watermark insertion with higher capacity while remaining reusable across programming languages and less dependent on brittle lexical features. To recover the watermark, CLASP uses reference-code retrieval and differential comparison to identify transformation traces, avoiding task-specific model training while improving robustness to structural edits and adaptive attacks. Experiments across multiple programming languages show that CLASP consistently outperforms existing baselines in watermark extraction accuracy and robustness, while maintaining code quality under both random removal and adaptive de-watermarking attacks.
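The embed-and-recover idea in the abstract can be illustrated with a toy sketch. It is not CLASP's implementation. The rule set, the function names `embed` and `extract`, and the line-by-line comparison are all assumptions made for illustration, and the reference code is assumed to be already available rather than retrieved. Each rule pairs a canonical form with an equivalent alternative; a 1 bit applies the alternative at the next matching site, and extraction reads the bits back by diffing against the reference.

```python
from typing import List, Optional, Tuple

# Hypothetical rule set (not from the paper): (canonical form, equivalent alternative).
RULES: List[Tuple[str, str]] = [
    ("total += v", "total = total + v"),
    ("if not items:", "if len(items) == 0:"),
    ("count += 1", "count = count + 1"),
]

REFERENCE = """def mean(items):
    if not items:
        return 0
    total = 0
    count = 0
    for v in items:
        total += v
        count += 1
    return total / count"""


def _rule_for(line: str) -> Optional[Tuple[str, str]]:
    """Return the first rule whose canonical form occurs in the line, if any."""
    for canonical, alternative in RULES:
        if canonical in line:
            return canonical, alternative
    return None


def embed(reference: str, bits: List[int]) -> str:
    """Apply the alternative form at the i-th transformation site when bit i is 1."""
    remaining = list(bits)
    out = []
    for line in reference.splitlines():
        rule = _rule_for(line)
        if rule and remaining:
            if remaining.pop(0) == 1:
                line = line.replace(rule[0], rule[1])
        out.append(line)
    if remaining:
        raise ValueError("not enough transformation sites for the payload")
    return "\n".join(out)


def extract(reference: str, watermarked: str, n_bits: int) -> List[int]:
    """Recover bits by comparing each site in the reference with the watermarked line."""
    bits: List[int] = []
    for ref_line, wm_line in zip(reference.splitlines(), watermarked.splitlines()):
        if len(bits) == n_bits:
            break
        if _rule_for(ref_line):
            bits.append(1 if wm_line != ref_line else 0)
    return bits


if __name__ == "__main__":
    watermarked = embed(REFERENCE, [1, 0, 1])
    print(watermarked)
    print(extract(REFERENCE, watermarked, 3))
```

This sketch assumes the watermarked code keeps the same line layout as the reference, so it would not survive the structural edits and adaptive attacks the abstract mentions. It only shows how a fixed set of equivalent rewrites can carry bits without relying on identifier names.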

Rui Xu, Jiawei Chen, Weizhi Liu, Zhaoxia Yin, Cong Kong, Xinpeng Zhang • 2025

Related benchmarks

Task                      Dataset                 Metric          Result  Rank
Source Code Watermarking  MBCPP                   Bit Accuracy    99.64   32
Source Code Watermarking  MBJP                    Bit Accuracy    99.72   32
Source Code Watermarking  MBPP                    Bit Accuracy    97.85   18
Source Code Watermarking  CSN Java                Bit Accuracy    98.05   8
Source Code Watermarking  Qualitative Comparison  Capacity (BPF)  32      5
Source Code Watermarking  MBJSP                   BitAcc          99.47   5
Source Code Watermarking  CSN JS                  Bit Accuracy    98.08   4
Source Code Watermarking  MBPP                    Bit Accuracy    97.85   2
