Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text
About
Detecting text generated by modern large language models is thought to be hard, as both LLMs and humans can exhibit a wide range of complex behaviors. However, we find that a score based on contrasting two closely related language models is highly accurate at separating human-generated and machine-generated text. Based on this mechanism, we propose a novel LLM detector that only requires simple calculations using a pair of pre-trained LLMs. The method, called Binoculars, achieves state-of-the-art accuracy without any training data. It is capable of spotting machine text from a range of modern LLMs without any model-specific modifications. We comprehensively evaluate Binoculars on a number of text sources and in varied situations. Over a wide range of document types, Binoculars detects over 90% of generated samples from ChatGPT (and other LLMs) at a false positive rate of 0.01%, despite not being trained on any ChatGPT data.
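The abstract describes the score only as a contrast between two closely related language models. A minimal sketch of one way to compute such a contrast is given below, assuming the score is the ratio of a log-perplexity to a cross-perplexity between the two models' next-token distributions. The function names, the `(T, V)` logit layout, and which model plays which role are illustrative assumptions, not the authors' implementation. In practice the logits would come from two pre-trained LLMs sharing a tokenizer; here they are plain arrays so the arithmetic is easy to follow.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def contrast_score(logits_a, logits_b, token_ids):
    """Hypothetical two-model contrast score (not the authors' code).

    logits_a, logits_b: arrays of shape (T, V); row i holds each model's
        next-token logits for position i.
    token_ids: length-T integer array; token_ids[i] is the token that
        actually appears at position i.
    """
    logp_a = log_softmax(np.asarray(logits_a, dtype=float))
    logp_b = log_softmax(np.asarray(logits_b, dtype=float))
    token_ids = np.asarray(token_ids)
    positions = np.arange(len(token_ids))

    # Log-perplexity: mean negative log-likelihood of the observed tokens
    # under model B (assumed role assignment).
    log_ppl = -logp_b[positions, token_ids].mean()

    # Cross-perplexity: mean cross-entropy of model A's next-token
    # distribution scored against model B's log-probabilities.
    cross_ppl = -(np.exp(logp_a) * logp_b).sum(axis=-1).mean()

    return log_ppl / cross_ppl

# Toy usage: two identical models with a peaked distribution, and text that
# always picks the most likely token.
toy_logits = np.tile(np.array([2.0, 0.0, 0.0]), (4, 1))
toy_tokens = np.zeros(4, dtype=int)
print(contrast_score(toy_logits, toy_logits, toy_tokens))
```

In this toy case the observed tokens are less surprising than the models expect on average, so the ratio falls below 1. Under the assumption made here, low scores of this kind would point toward machine-generated text, and a detector would compare the score against a fixed threshold.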
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| AI-generated text detection | READ (test) | Accuracy | 84.7 | 55 |
| Machine-generated text detection | TruthfulQA | TPR@FPR-1% (ChatGLM) | 93.93 | 54 |
| Machine-generated text detection | Xsum | AUROC | 99 | 40 |
| Machine-generated text detection | Essay (test) | GPT4All Score | 94.94 | 39 |
| AI-generated text detection | AcademicResearch | AUC | 98.9 | 36 |
| AI-generated text detection | Essay | AUROC (GPT4All) | 98.56 | 35 |
| Machine-generated text detection | WritingPrompts | AUROC | 0.99 | 30 |
| Machine-generated text detection | SQuAD | AUROC | 78 | 30 |
| AI-generated text detection | M4 | AUROC | 90 | 27 |
| AI-generated text detection | RealDet | AUROC | 93.67 | 27 |
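Several results are reported as a true positive rate at a fixed false positive rate (for example TPR@FPR-1%). A minimal sketch of how such a figure can be computed from detector scores follows. It assumes that lower scores mean "more machine-like" and that the threshold is picked on human-written samples; the function name `tpr_at_fpr` and its arguments are hypothetical, not taken from any benchmark's evaluation code.

```python
import numpy as np

def tpr_at_fpr(human_scores, machine_scores, target_fpr=0.01):
    """Return (tpr, threshold) with at most target_fpr of human texts flagged.

    Assumption: a text is flagged as machine-generated when its score is
    strictly below the threshold.
    """
    human = np.sort(np.asarray(human_scores, dtype=float))
    machine = np.asarray(machine_scores, dtype=float)

    # Number of human samples we are allowed to misclassify.
    k = int(np.floor(target_fpr * len(human)))
    k = min(k, len(human) - 1)

    # With a strict "<" comparison, at most k human scores fall below the
    # (k+1)-th smallest human score, so the realized FPR is <= target_fpr.
    threshold = human[k]
    tpr = float((machine < threshold).mean())
    return tpr, float(threshold)

# Toy usage with synthetic, well-separated scores.
human = np.arange(100, 200)   # 100 human-written samples
machine = np.array([50, 150]) # 2 machine-written samples
print(tpr_at_fpr(human, machine, target_fpr=0.01))
```

Very low false positive rates, such as the 0.01% quoted in the abstract, need a large pool of human-written samples: with fewer than 10,000 of them, this procedure allows zero false positives and the threshold is simply the lowest human score.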