Pseudorandom Error-Correcting Codes

About

We construct pseudorandom error-correcting codes (or simply pseudorandom codes), which are error-correcting codes with the property that any polynomial number of codewords are pseudorandom to any computationally-bounded adversary. Efficient decoding of corrupted codewords is possible with the help of a decoding key. We build pseudorandom codes that are robust to substitution and deletion errors, where pseudorandomness rests on standard cryptographic assumptions. Specifically, pseudorandomness is based on either $2^{O(\sqrt{n})}$-hardness of LPN, or polynomial hardness of LPN and the planted XOR problem at low density. As our primary application of pseudorandom codes, we present an undetectable watermarking scheme for outputs of language models that is robust to cropping and a constant rate of random substitutions and deletions. The watermark is undetectable in the sense that any number of samples of watermarked text are computationally indistinguishable from text output by the original model. This is the first undetectable watermarking scheme that can tolerate a constant rate of errors. Our second application is to steganography, where a secret message is hidden in innocent-looking content. We present a constant-rate stateless steganography scheme with robustness to a constant rate of substitutions. Ours is the first stateless steganography scheme with provable steganographic security and any robustness to errors.
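To make the role of the decoding key concrete, here is a minimal toy sketch in Python of keyed detection through secret sparse parity checks. It is not the construction from the paper: the function names (`keygen`, `encode`, `detect`), the use of disjoint checks, and all parameters are assumptions chosen for illustration. The toy is also not pseudorandom against a real adversary, since its short disjoint checks can be found from a handful of samples. It shows only the mechanism that a key holder can recognise noisy codewords by counting satisfied checks.

```python
import random

def keygen(n, t, rng):
    """Toy secret key: disjoint t-sparse parity checks over shuffled positions.
    (Hypothetical helper for illustration; not the paper's key generation.)"""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i:i + t] for i in range(0, n - n % t, t)]

def encode(checks, n, noise, rng):
    """Sample a random word satisfying every check, then flip each bit
    independently with probability `noise`."""
    x = [rng.getrandbits(1) for _ in range(n)]
    for c in checks:
        parity = sum(x[i] for i in c) % 2
        x[c[-1]] ^= parity  # force even parity on this check
    return [b ^ int(rng.random() < noise) for b in x]

def detect(checks, x, threshold=0.65):
    """Accept if the fraction of satisfied checks is at least `threshold`.
    A uniformly random word satisfies about half of the checks."""
    satisfied = sum(1 for c in checks if sum(x[i] for i in c) % 2 == 0)
    return satisfied / len(checks) >= threshold

if __name__ == "__main__":
    rng = random.Random(0)
    n, t = 3000, 3
    key = keygen(n, t, rng)
    codeword = encode(key, n, noise=0.05, rng=rng)
    uniform = [rng.getrandbits(1) for _ in range(n)]
    print(detect(key, codeword), detect(key, uniform))
```

With noise rate η and checks of weight t, each check on a noisy codeword is satisfied with probability 1/2 + (1 − 2η)^t / 2, so the threshold only needs to sit between that value and 1/2. In this toy that gap is what gives robustness to substitutions.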

Miranda Christ, Sam Gunn • 2024

Related benchmarks

Task	Dataset	Metric	Result	
Watermark Detection	Llama-3 8B Instruct 30 tokens (generations)	Mean Precision	17	13
Watermark Detection	Llama-3-8B-Instruct 150 tokens (generations)	Mean P	1.6	13
Watermark Detection	Llama-3-8B Delete perturbation, 30 tokens 1.0 (test)	Mean P	0.2	6
Watermark Detection	Llama-3-8B Delete perturbation, 150 tokens 1.0 (test)	Mean P	0.29	6
Watermark Detection	Llama-3-8B Translate perturbation, 150 tokens 1.0 (test)	Mean P	0.093	6
Watermark Detection Robustness	Llama-3-8B Delete 30%, 30 Tokens	Mean P	0.26	6
Watermark Detection Robustness	Llama-3-8B Delete 30%, 150 Tokens	Mean P	0.31	6
Watermark Detection Robustness	Llama-3-8B Delete 50%, 30 Tokens	Mean P	0.38	6
Watermark Detection Robustness	Llama-3-8B Delete 50%, 150 Tokens	Mean P	0.36	6
Watermark Detection	Llama-3-8B Translate perturbation, 30 tokens 1.0 (test)	Mean P	0.052	6

Showing 10 of 21 rows
