Detecting Subtle Differences between Human and Model Languages Using Spectrum of Relative Likelihood
About
Human- and model-generated texts can be distinguished by examining the magnitude of likelihood in language. However, this is becoming increasingly difficult as language models' capabilities for generating human-like text keep evolving. This study provides a new perspective by using relative likelihood values instead of absolute ones, and by extracting useful features from a spectrum view of likelihood for the human-model text detection task. We propose a detection procedure with two classification methods, one supervised and one heuristic-based, which achieves performance competitive with previous zero-shot detection methods and a new state of the art on short-text detection. Our method can also reveal subtle differences between human and model languages, which find theoretical roots in psycholinguistics studies. Our code is available at https://github.com/CLCS-SUSTech/FourierGPT
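The core idea above can be illustrated with a minimal sketch: take a sequence of per-token log-likelihoods from any scoring language model, normalize them into relative values, and inspect the magnitude spectrum via a Fourier transform. Note this is an illustrative assumption of the pipeline, not the repository's exact implementation; the normalization choice (`z-scoring` here) and the feature extraction in FourierGPT may differ.

```python
import numpy as np

def likelihood_spectrum(log_likelihoods):
    """Sketch of a spectrum-of-relative-likelihood computation.

    Assumes `log_likelihoods` is a 1-D sequence of per-token log-likelihoods
    obtained from some language model (not provided here).
    """
    x = np.asarray(log_likelihoods, dtype=float)
    # Relative likelihood: remove the absolute level by z-scoring,
    # so only the fluctuation pattern along the sequence remains.
    x = (x - x.mean()) / (x.std() + 1e-8)
    # One-sided magnitude spectrum of the normalized sequence.
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x))
    return freqs, spectrum

# Example with dummy token log-likelihoods.
freqs, spec = likelihood_spectrum([-2.1, -0.5, -3.3, -1.0, -2.8, -0.7])
```

Because the sequence is mean-centered before the transform, the zero-frequency (DC) component vanishes, and any downstream classifier sees only the shape of the likelihood fluctuations rather than their absolute magnitude.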
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| LLM-generated text detection | EvoBench | LLaMA3 Score: 63.99 | 26 |
| Machine-generated text detection | MAGE | -- | 18 |
| Machine-generated text detection | DetectRL (Training Text: ChatGPT) | -- | 12 |
| AI-generated text detection | Reuters | GPT4All Score: 99.28 | 8 |
| AI-generated text detection | Essay | AUROC (GPT4All): 91.7 | 8 |
| Machine-generated text detection | DetectRL (Llama-2-70b) | AUROC: 0.5811 | 6 |
| Machine-generated text detection | DetectRL (Google-PaLM) | AUROC: 59.99 | 6 |