Pseudorandom Error-Correcting Codes
About
We construct pseudorandom error-correcting codes (or simply pseudorandom codes), which are error-correcting codes with the property that any polynomial number of codewords are pseudorandom to any computationally-bounded adversary. Efficient decoding of corrupted codewords is possible with the help of a decoding key. We build pseudorandom codes that are robust to substitution and deletion errors, where pseudorandomness rests on standard cryptographic assumptions. Specifically, pseudorandomness is based on either $2^{O(\sqrt{n})}$-hardness of LPN, or polynomial hardness of LPN and the planted XOR problem at low density. As our primary application of pseudorandom codes, we present an undetectable watermarking scheme for outputs of language models that is robust to cropping and a constant rate of random substitutions and deletions. The watermark is undetectable in the sense that any number of samples of watermarked text are computationally indistinguishable from text output by the original model. This is the first undetectable watermarking scheme that can tolerate a constant rate of errors. Our second application is to steganography, where a secret message is hidden in innocent-looking content. We present a constant-rate stateless steganography scheme with robustness to a constant rate of substitutions. Ours is the first stateless steganography scheme with provable steganographic security and any robustness to errors.
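The abstract rests pseudorandomness on the hardness of LPN (learning parity with noise). As a rough illustration of what an LPN instance looks like, the sketch below generates noisy parity samples over GF(2) and shows that whoever holds the secret sees a disagreement rate near the noise rate, while a wrong candidate disagrees about half the time. This is not the paper's code construction. The function names, the dimension `n = 32`, the sample count, and the noise rate `0.1` are all illustrative assumptions, far too small for any security.

```python
import random


def lpn_samples(secret, num_samples, noise_rate, rng):
    """Return (rows, labels) where labels[i] = <rows[i], secret> + e_i mod 2
    and each e_i is 1 with probability noise_rate (toy parameters)."""
    n = len(secret)
    rows, labels = [], []
    for _ in range(num_samples):
        a = [rng.randrange(2) for _ in range(n)]          # uniform row over GF(2)
        inner = sum(x & y for x, y in zip(a, secret)) % 2  # parity <a, secret>
        noise = 1 if rng.random() < noise_rate else 0      # Bernoulli noise bit
        rows.append(a)
        labels.append(inner ^ noise)
    return rows, labels


def disagreement_rate(rows, labels, candidate):
    """Fraction of samples whose label differs from <row, candidate> mod 2."""
    wrong = 0
    for a, b in zip(rows, labels):
        inner = sum(x & y for x, y in zip(a, candidate)) % 2
        wrong += inner ^ b
    return wrong / len(labels)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 32                  # illustrative dimension only
secret = [rng.randrange(2) for _ in range(n)]
rows, labels = lpn_samples(secret, 2000, 0.1, rng)

# A candidate that differs from the secret in exactly one position.
wrong_guess = [secret[0] ^ 1] + secret[1:]

rate_with_secret = disagreement_rate(rows, labels, secret)
rate_with_wrong = disagreement_rate(rows, labels, wrong_guess)
print("disagreement with secret:     ", rate_with_secret)  # near the noise rate
print("disagreement with wrong guess:", rate_with_wrong)   # near one half
```

The gap between the two rates is what a key holder can exploit. The hardness assumption is, informally, that without the secret the labels are computationally indistinguishable from uniform random bits.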
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Watermark Detection | Llama-3 8B Instruct 30 tokens (generations) | Mean Precision | 17 | 13 |
| Watermark Detection | Llama-3-8B-Instruct 150 tokens (generations) | Mean P | 1.6 | 13 |
| Watermark Detection | Llama-3-8B Delete perturbation, 30 tokens 1.0 (test) | Mean P | 0.2 | 6 |
| Watermark Detection | Llama-3-8B Delete perturbation, 150 tokens 1.0 (test) | Mean P | 0.29 | 6 |
| Watermark Detection | Llama-3-8B Translate perturbation, 150 tokens 1.0 (test) | Mean P | 0.093 | 6 |
| Watermark Detection Robustness | Llama-3-8B Delete 30%, 30 Tokens | Mean P | 0.26 | 6 |
| Watermark Detection Robustness | Llama-3-8B Delete 30%, 150 Tokens | Mean P | 0.31 | 6 |
| Watermark Detection Robustness | Llama-3-8B Delete 50%, 30 Tokens | Mean P | 0.38 | 6 |
| Watermark Detection Robustness | Llama-3-8B Delete 50%, 150 Tokens | Mean P | 0.36 | 6 |
| Watermark Detection | Llama-3-8B Translate perturbation, 30 tokens 1.0 (test) | Mean P | 0.052 | 6 |