Efficient Provably Secure Linguistic Steganography via Range Coding
About
Linguistic steganography embeds secret messages within seemingly innocuous texts to enable covert communication. Provable security, a long-standing goal and key motivation of the field, has been extended to language-model-based steganography. Previous provably secure approaches achieve perfect imperceptibility, measured by zero Kullback-Leibler (KL) divergence, but at the expense of embedding capacity. In this paper, we directly apply a classic entropy coding method (range coding) to achieve secure steganography, and propose an efficient, provably secure linguistic steganographic method with a rotation mechanism. Experiments across various language models show that our method achieves around 100% entropy utilization (embedding efficiency), outperforming existing baseline methods in embedding capacity. Moreover, it reaches high embedding speeds (up to 1554.66 bits/s on GPT-2). The code is available at github.com/ryehr/RRC_steganography.
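To illustrate the core idea behind entropy-coding-based steganography, the sketch below treats the secret bitstream as the binary expansion of a point in [0, 1) and selects the next token by locating that point inside the language model's cumulative probability intervals, the way an arithmetic/range decoder would. This is a minimal toy sketch of the general technique, not the paper's actual algorithm (in particular, it omits the rotation mechanism); the function `embed_token`, its signature, and the fixed 32-bit precision are all illustrative assumptions.

```python
from fractions import Fraction

def embed_token(probs, bits, pos):
    """Illustrative range-decoding step for steganographic embedding.

    probs : list of (token, probability) pairs (Fractions summing to 1)
    bits  : secret message as a string of '0'/'1' characters
    pos   : index of the next unread secret bit

    Returns (token, new_pos): the chosen token and the updated bit
    position; a bit counts as embedded once the chosen token's interval
    pins down its value.
    """
    # Point in [0, 1) encoded by the next bits (implicitly zero-padded).
    point = Fraction(0)
    for i, b in enumerate(bits[pos:pos + 32]):  # 32 bits of precision
        if b == '1':
            point += Fraction(1, 2 ** (i + 1))

    # Walk the cumulative distribution; emit the token whose interval
    # contains the point.
    low = Fraction(0)
    for token, p in probs:
        high = low + Fraction(p)
        if low <= point < high:
            # Count leading bits already determined: a bit is consumed
            # once both interval endpoints agree on it (standard
            # renormalization in range coding).
            consumed = 0
            lo, hi = low, high
            while consumed < 32:
                lo2, hi2 = lo * 2, hi * 2
                if hi2 <= 1:           # interval within [0, 1/2): bit 0
                    lo, hi = lo2, hi2
                elif lo2 >= 1:         # interval within [1/2, 1): bit 1
                    lo, hi = lo2 - 1, hi2 - 1
                else:
                    break
                consumed += 1
            return token, pos + consumed
        low = high
    raise ValueError("probabilities must sum to 1")
```

With a toy distribution {a: 1/2, b: 1/4, c: 1/4} and secret bits "110", the point 0.11₂ = 3/4 falls in c's interval [3/4, 1), so "c" is emitted and both bits are consumed; a receiver who reruns the model and re-derives the same intervals recovers the bits from the token alone.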
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Linguistic Steganography | Llama-2-7B | Avg KLD (bits/token): 0.00e+0 | 10 |
| Linguistic Steganography | OPT-1.3B | Avg KLD (bits/token): 0.00e+0 | 10 |
| Linguistic Steganography | GPT-2 | Avg KLD (bits/token): 0.00e+0 | 10 |