Human Texts Are Outliers: Detecting LLM-generated Texts via Out-of-distribution Detection
About
The rapid advancement of large language models (LLMs) such as ChatGPT, DeepSeek, and Claude has significantly increased the presence of AI-generated text in digital communication. This trend has heightened the need for reliable detection methods to distinguish between human-authored and machine-generated content. Existing approaches, both zero-shot methods and supervised classifiers, largely conceptualize this task as a binary classification problem, which often leads to poor generalization across domains and models. In this paper, we argue that such a binary formulation fundamentally mischaracterizes the detection task by assuming a coherent representation of human-written texts. In reality, human texts do not constitute a unified distribution, and their diversity cannot be effectively captured through limited sampling. This causes previous classifiers to memorize observed OOD characteristics rather than learn the essence of "non-ID" behavior, limiting generalization to unseen human-authored inputs. Based on this observation, we propose reframing the detection task as an out-of-distribution (OOD) detection problem, treating human-written texts as distributional outliers and machine-generated texts as in-distribution (ID) samples. To this end, we develop a detection framework using one-class learning methods, including DeepSVDD and HRN, and score-based learning techniques such as energy-based methods, enabling robust and generalizable performance. Extensive experiments across multiple datasets validate the effectiveness of our OOD-based approach. Specifically, the OOD-based method achieves 98.3% AUROC and AUPR with only 8.9% FPR95 on the DeepFake dataset. Moreover, we test our detection framework in multilingual, attacked, unseen-model, and unseen-domain text settings, demonstrating the robustness and generalizability of our framework. Code, pretrained weights, and demo will be released.
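To make the OOD framing concrete, the following is a minimal sketch, not the authors' implementation. It assumes text embeddings and classifier logits are already available as NumPy arrays, and every function name here is hypothetical. The one-class score is a simplified DeepSVDD-style distance to a fixed center (the mean of machine-text embeddings); the actual DeepSVDD method also trains the encoder, which is omitted. The energy score follows the common definition E(x) = -T · logsumexp(logits / T). FPR95 is computed under the usual OOD convention, assumed here: the fraction of OOD (human) samples accepted as ID when the threshold keeps 95% of ID (machine) samples.

```python
import numpy as np

def fit_center(id_embeddings):
    # Center of the in-distribution (machine-generated) embeddings.
    # Simplification: real DeepSVDD also learns the encoder; here it is fixed.
    return id_embeddings.mean(axis=0)

def svdd_score(embeddings, center):
    # Squared Euclidean distance to the center; larger means more OOD-like.
    return np.sum((embeddings - center) ** 2, axis=1)

def energy_score(logits, temperature=1.0):
    # E(x) = -T * logsumexp(logits / T), computed in a numerically stable way.
    z = logits / temperature
    m = z.max(axis=1, keepdims=True)
    lse = m.squeeze(1) + np.log(np.exp(z - m).sum(axis=1))
    return -temperature * lse

def fpr_at_95_tpr(id_scores, ood_scores):
    # Scores follow "higher = more OOD". The threshold keeps 95% of ID samples;
    # the result is the fraction of OOD samples that still fall below it.
    threshold = np.percentile(id_scores, 95)
    return float(np.mean(ood_scores <= threshold))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    machine = rng.normal(0.0, 1.0, size=(500, 16))  # synthetic ID embeddings
    human = rng.normal(3.0, 2.0, size=(500, 16))    # synthetic OOD embeddings
    center = fit_center(machine)
    fpr = fpr_at_95_tpr(svdd_score(machine, center), svdd_score(human, center))
    print(f"FPR95 on synthetic data: {fpr:.3f}")
```

The synthetic data only exercises the scoring functions; it says nothing about performance on real text. Note that only machine-generated samples are used to fit the center, which mirrors the abstract's point that human text is never modeled as a single class.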
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Machine-generated text detection | MAGE | AUROC (Avg): 98 | 24 |
| LLM-generated text detection | DetectRL | -- | 12 |
| AI Text Detection | M4GT | AUROC: 76.1 | 10 |
| AI Text Detection | Ghostbuster | AUROC: 94 | 10 |
| AI Text Detection | HC3 | AUROC: 99.3 | 10 |
| LLM-generated text detection | MELD GPT-5.4-Mini (eval) | TPR @ 1% FPR: 2.7 | 10 |
| LLM-generated text detection | MELD-eval Gemini-3-Flash | TPR @ 1% FPR: 1.6 | 10 |
| LLM-generated text detection | MELD Overall (eval) | TPR @ 1% FPR: 1.6 | 10 |
| AI Text Detection | MELD (eval) | AUROC: 65.8 | 10 |
| LLM-generated text detection | MELD Qwen-3.6-Plus (eval) | TPR @ 1% FPR: 1.1 | 10 |