DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature

About

The increasing fluency and widespread usage of large language models (LLMs) highlight the desirability of corresponding tools aiding detection of LLM-generated text. In this paper, we identify a property of the structure of an LLM's probability function that is useful for such detection. Specifically, we demonstrate that text sampled from an LLM tends to occupy negative curvature regions of the model's log probability function. Leveraging this observation, we then define a new curvature-based criterion for judging if a passage is generated from a given LLM. This approach, which we call DetectGPT, does not require training a separate classifier, collecting a dataset of real or generated passages, or explicitly watermarking generated text. It uses only log probabilities computed by the model of interest and random perturbations of the passage from another generic pre-trained language model (e.g., T5). We find DetectGPT is more discriminative than existing zero-shot methods for model sample detection, notably improving detection of fake news articles generated by 20B parameter GPT-NeoX from 0.81 AUROC for the strongest zero-shot baseline to 0.95 AUROC for DetectGPT. See https://ericmitchell.ai/detectgpt for code, data, and other project information.
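The curvature criterion in the abstract can be illustrated with a small sketch. It compares a passage's log probability under the model of interest with the average log probability of randomly perturbed copies of that passage; a large positive gap suggests the passage sits near a local maximum, which is the negative-curvature situation the abstract describes. Everything below is an assumption made for illustration: `perturbation_discrepancy`, `toy_log_prob`, and `make_toy_perturb` are hypothetical names, the unigram table stands in for a real LLM's log probabilities, and the word-swapping function stands in for perturbations from a generic pre-trained model such as T5. It is not the authors' implementation.

```python
import random
import statistics
from typing import Callable


def perturbation_discrepancy(
    text: str,
    log_prob: Callable[[str], float],
    perturb: Callable[[str], str],
    n_perturbations: int = 20,
) -> float:
    """Estimate log p(text) minus the mean log p over perturbed copies."""
    original = log_prob(text)
    perturbed = [log_prob(perturb(text)) for _ in range(n_perturbations)]
    return original - statistics.fmean(perturbed)


# --- Toy stand-ins (hypothetical, for illustration only) ---

# A made-up unigram "model": per-word log probabilities.
TOY_LOG_PROBS = {"the": -1.0, "cat": -2.0, "sat": -2.5, "on": -1.5, "mat": -3.0}
UNKNOWN_LOG_PROB = -8.0  # assumed score for any word outside the table
TOY_FILLERS = ["zebra", "quantum", "vortex"]  # all outside the table


def toy_log_prob(text: str) -> float:
    """Sum of per-word log probabilities under the toy unigram table."""
    return sum(TOY_LOG_PROBS.get(w, UNKNOWN_LOG_PROB) for w in text.split())


def make_toy_perturb(rng: random.Random) -> Callable[[str], str]:
    """Build a perturbation that swaps one random word for a filler word."""

    def perturb(text: str) -> str:
        words = text.split()
        words[rng.randrange(len(words))] = rng.choice(TOY_FILLERS)
        return " ".join(words)

    return perturb


rng = random.Random(0)  # fixed seed so repeated runs agree
perturb = make_toy_perturb(rng)

# A passage the toy model finds likely: perturbing it lowers its score.
likely = perturbation_discrepancy("the cat sat on the mat", toy_log_prob, perturb)
# A passage the toy model already finds unlikely: perturbing changes nothing.
unlikely = perturbation_discrepancy("zebra quantum vortex", toy_log_prob, perturb)

print(f"discrepancy (likely passage):   {likely:.2f}")
print(f"discrepancy (unlikely passage): {unlikely:.2f}")
```

In a real setting, `log_prob` would query the model whose output is being detected, `perturb` would sample rewrites from a mask-filling model, and a passage would be flagged when the discrepancy exceeds a threshold chosen by the user; that threshold is a tuning choice not specified here.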

Eric Mitchell, Yoonho Lee, Alexander Khazatsky, Christopher D. Manning, Chelsea Finn • 2023

Related benchmarks

Task | Dataset | Result | Rank
Machine-generated text detection | MGT benchmark Essay | AUROC 64.4 | 129
LGT Detection | Fast-DetectGPT XSum (test) | AUROC 93.2 | 96
LGT Detection | Fast-DetectGPT PubMed (test) | AUROC 0.744 | 96
LGT Detection | WritingPrompts-small Fast-DetectGPT benchmark | AUROC 95.5 | 54
LGT Detection | WritingPrompts small Fast-DetectGPT benchmark (test) | AUROC 95.5 | 54
LGT Detection | XSum Fast-DetectGPT benchmark | AUROC 93.2 | 54
LGT Detection | PubMed Fast-DetectGPT benchmark | AUROC 0.744 | 54
LGT Detection | MGTBench WritingPrompts | AUROC 63.9 | 45
Machine-generated text detection | MGT benchmark Reuters | AUROC 73 | 45
AI-generated text detection | Long-form QA 3K generations corpus | Detection Accuracy (1% FPR) 74.9 | 42
