Advancing Beyond Identification: Multi-bit Watermark for Large Language Models
About
We show the viability of tackling misuse of large language models beyond the identification of machine-generated text. While existing zero-bit watermark methods focus on detection only, some malicious misuses demand tracing the adversarial user in order to counteract them. To address this, we propose Multi-bit Watermark via Position Allocation, which embeds traceable multi-bit information during language model generation. By allocating tokens to different positions of the message, we can embed longer messages under high-corruption settings without added latency. Because sub-units of the message are embedded independently, the proposed method outperforms existing work in both robustness and latency. Leveraging the benefits of zero-bit watermarking, our method enables robust extraction of the watermark without any model access, embedding and extraction of long messages ($\geq$ 32-bit) without finetuning, and preservation of text quality, while still allowing zero-bit detection, all at the same time. Code is released here: https://github.com/bangawayoo/mb-lm-watermarking
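The core idea can be illustrated with a toy sketch. The following is not the authors' implementation (see the linked repository for that); it is a minimal, self-contained illustration under simplifying assumptions: a tiny integer vocabulary, a context window of one previous token, and deterministic "sampling" that always emits a green-listed token. Each generation step is pseudorandomly routed to one position of the message via a hash of the context token, and the vocabulary is partitioned into a "green" half keyed by that position's bit value; extraction tallies per-position majority votes, so no model access is needed.

```python
# Hypothetical sketch of position-allocation multi-bit watermarking.
# All names (VOCAB, allocate_position, green_list, ...) are illustrative.
import hashlib
import random

VOCAB = list(range(1000))   # toy vocabulary of token ids
MSG = [1, 0, 1, 1]          # 4-bit message to embed

def _seed(context_token: int, salt: str) -> int:
    digest = hashlib.sha256(f"{salt}:{context_token}".encode()).hexdigest()
    return int(digest, 16)

def allocate_position(context_token: int, msg_len: int) -> int:
    # Pseudorandomly route this generation step to one message position,
    # so each sub-unit of the message is embedded independently.
    return _seed(context_token, "pos") % msg_len

def green_list(context_token: int, bit: int) -> set:
    # Shuffle the vocabulary with a key derived from the context token;
    # the half selected by `bit` is the "green" (favored) list.
    rng = random.Random(_seed(context_token, "vocab"))
    perm = VOCAB[:]
    rng.shuffle(perm)
    half = len(perm) // 2
    return set(perm[:half] if bit == 0 else perm[half:])

def embed(prompt_token: int, n_tokens: int, message: list) -> list:
    # Stand-in for watermarked decoding: always emit a green token
    # for the bit value at this step's allocated position.
    out = [prompt_token]
    for step in range(n_tokens):
        prev = out[-1]
        pos = allocate_position(prev, len(message))
        greens = sorted(green_list(prev, message[pos]))
        out.append(random.Random(step).choice(greens))
    return out

def extract(tokens: list, msg_len: int) -> list:
    # Each token votes, at the position its context was allocated to,
    # for the bit value whose green list contains it; majority wins.
    votes = [[0, 0] for _ in range(msg_len)]
    for prev, cur in zip(tokens, tokens[1:]):
        pos = allocate_position(prev, msg_len)
        bit = 0 if cur in green_list(prev, 0) else 1
        votes[pos][bit] += 1
    return [0 if v[0] >= v[1] else 1 for v in votes]

watermarked = embed(0, 200, MSG)
print(extract(watermarked, len(MSG)))
```

In a real decoder the green list would bias the model's logits rather than replace sampling outright, and robustness comes from the majority vote: corrupting some tokens flips only a minority of votes at each position.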
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Fake News Detection | FAKE NEWS | Accuracy | 93.31 | 66 |
| Watermark Detection | mmw story | Accuracy | 99.61 | 48 |
| Watermark Detection | fake_news | Accuracy | 97.94 | 48 |
| Watermark Detection | book_report | Accuracy | 98.31 | 48 |
| Watermark Detection | finance_qa | Accuracy | 92.22 | 48 |
| Watermark Detection | longform_qa | Accuracy | 90.13 | 48 |
| Watermark Detection | dolly_cw | Accuracy | 90.81 | 48 |
| Detection Accuracy | mmw story | Accuracy | 97.66 | 24 |
| Detection Accuracy | LongForm QA | Accuracy | 94.16 | 24 |
| Detection Accuracy | C4 subset | Accuracy | 95.19 | 24 |