Neural Linguistic Steganography
About
Whereas traditional cryptography encrypts a secret message into an unintelligible form, steganography conceals the fact that communication is taking place by encoding a secret message into a cover signal. Language is a particularly pragmatic cover signal due to its benign occurrence and independence from any one medium. Traditionally, linguistic steganography systems encode secret messages in existing text via synonym substitution or word order rearrangements. Advances in neural language models enable previously impractical generation-based techniques. We propose a steganography technique based on arithmetic coding with large-scale neural language models. We find that our approach can generate realistic-looking cover sentences, as evaluated by humans, while preserving security by matching the cover message distribution with the language model distribution.
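To make the arithmetic-coding idea concrete, here is a minimal sketch and not the authors' implementation. It treats the secret bits as a binary fraction in [0, 1) and repeatedly picks the token whose probability sub-interval contains that fraction. Several things here are assumptions for illustration: `toy_model` is a hypothetical stand-in for a neural language model, `encode` and `decode` are made-up names, exact `Fraction` arithmetic replaces the fixed-precision arithmetic a practical system would need, and sender and receiver are assumed to share both the model and the message length.

```python
from fractions import Fraction

VOCAB = ["the", "cat", "dog", "sat", "ran"]
WEIGHTS = [Fraction(4, 10), Fraction(3, 10), Fraction(1, 10),
           Fraction(1, 10), Fraction(1, 10)]


def toy_model(context):
    """Hypothetical stand-in for a neural LM: P(next token | context).

    The distribution is rotated by the context length so that it
    changes from step to step; a real model would condition on content.
    """
    shift = len(context) % len(VOCAB)
    return list(zip(VOCAB, WEIGHTS[shift:] + WEIGHTS[:shift]))


def encode(bits, model=toy_model):
    """Map a bit string to cover tokens by arithmetic decoding of the bits."""
    n = len(bits)
    m = int(bits, 2) if bits else 0
    scale = 2 ** n
    # Midpoint of the dyadic bucket [m/2^n, (m+1)/2^n) that represents the bits.
    target = Fraction(2 * m + 1, 2 * scale)
    lo, hi = Fraction(0), Fraction(1)
    tokens = []
    # Stop once the current interval lies entirely inside the bucket.
    while not (lo >= Fraction(m, scale) and hi <= Fraction(m + 1, scale)):
        width = hi - lo
        cum = lo
        for token, p in model(tokens):
            nxt = cum + p * width
            if cum <= target < nxt:  # sub-interval containing the message
                tokens.append(token)
                lo, hi = cum, nxt
                break
            cum = nxt
    return tokens


def decode(tokens, n, model=toy_model):
    """Recover n secret bits by replaying the same interval narrowing."""
    lo, hi = Fraction(0), Fraction(1)
    for i, token in enumerate(tokens):
        width = hi - lo
        cum = lo
        for cand, p in model(tokens[:i]):
            nxt = cum + p * width
            if cand == token:
                lo, hi = cum, nxt
                break
            cum = nxt
    m = int(lo * 2 ** n)  # floor, since lo is non-negative
    return format(m, f"0{n}b") if n else ""


secret = "1011001"
cover = encode(secret)
recovered = decode(cover, len(secret))
```

When the secret bits are uniformly random (for example, after encryption), each token is selected with probability equal to its sub-interval width, which is why the generated text can track the language model distribution.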
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Linguistic Steganography | Llama 7b 2 | Avg KLD: 6.92e-4 | 10 |
| Linguistic Steganography | OPT-1.3B | Average KLD: 0.0019 | 10 |
| Linguistic Steganography | GPT-2 | Avg KLD (bits/token): 0.0019 | 10 |
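For readers unfamiliar with the KLD metric, the following is a small sketch of a per-step Kullback-Leibler divergence in bits. It assumes the divergence is measured from the distribution the encoder actually samples from (`q`) to the language model distribution (`p`); the function name `kl_bits` and that direction are assumptions for illustration, not details confirmed by the listings above.

```python
import math


def kl_bits(q, p):
    """KL(q || p) in bits for two discrete distributions over the same vocabulary.

    Terms with q_i == 0 contribute nothing, by the usual convention.
    """
    return sum(qi * math.log2(qi / pi) for qi, pi in zip(q, p) if qi > 0)


# Identical distributions give zero divergence.
same = kl_bits([0.5, 0.5], [0.5, 0.5])
# A deterministic encoder against a uniform binary model costs one bit.
skewed = kl_bits([1.0, 0.0], [0.5, 0.5])
```

Averaging this quantity over generated tokens would give a bits-per-token figure; lower values mean the cover text is statistically closer to ordinary model output.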