Training-free LLM-generated Text Detection by Mining Token Probability Sequences

About

Large language models (LLMs) have demonstrated remarkable capabilities in generating high-quality texts across diverse domains. However, the potential misuse of LLMs has raised significant concerns, underscoring the urgent need for reliable detection of LLM-generated texts. Conventional training-based detectors often struggle with generalization, particularly in cross-domain and cross-model scenarios. In contrast, training-free methods, which focus on inherent discrepancies through carefully designed statistical features, offer improved generalization and interpretability. Despite this, existing training-free detection methods typically rely on global text sequence statistics, neglecting the modeling of local discriminative features, thereby limiting their detection efficacy. In this work, we introduce a novel training-free detector, termed Lastde, that synergizes local and global statistics for enhanced detection. For the first time, we introduce time series analysis to LLM-generated text detection, capturing the temporal dynamics of token probability sequences. By integrating these local statistics with global ones, our detector reveals significant disparities between human and LLM-generated texts. We also propose an efficient alternative, Lastde++, to enable real-time detection. Extensive experiments on six datasets involving cross-domain, cross-model, and cross-lingual detection scenarios, under both white-box and black-box settings, demonstrated that our method consistently achieves state-of-the-art performance. Furthermore, our approach exhibits greater robustness against paraphrasing attacks compared to existing baseline methods.
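To make the idea of pairing a global statistic with a local one concrete, here is a minimal toy sketch. It is not the Lastde or Lastde++ score, which the abstract does not specify. It assumes the per-token log-probabilities have already been obtained from some scoring model, and every function name, the sliding-window statistic, and the ratio used to combine the two values are hypothetical choices made only for illustration.

```python
from statistics import mean, pstdev


def global_stat(log_probs):
    """Global statistic (assumed): mean token log-probability of the text."""
    return mean(log_probs)


def local_stat(log_probs, window=4):
    """Local statistic (assumed): spread of sliding-window means.

    Treats the log-probabilities as a time series and measures how much
    short-range averages fluctuate along the sequence.
    """
    if len(log_probs) < window:
        raise ValueError("sequence is shorter than the window")
    window_means = [
        mean(log_probs[i:i + window])
        for i in range(len(log_probs) - window + 1)
    ]
    return pstdev(window_means)


def toy_score(log_probs, window=4, eps=1e-8):
    """Hypothetical combination: global statistic divided by local spread."""
    return global_stat(log_probs) / (local_stat(log_probs, window) + eps)


if __name__ == "__main__":
    # Made-up log-probabilities, for illustration only.
    smooth = [-0.5] * 8
    bursty = [-0.1, -0.1, -0.1, -0.1, -5.0, -5.0, -5.0, -5.0]
    print(local_stat(smooth), local_stat(bursty))
```

In this sketch a sequence with constant log-probabilities has zero local spread, while one that switches between confident and surprising stretches has a positive spread, which is the kind of local signal a purely global average would miss.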

Yihuai Xu, Yongwei Wang, Yifei Bi, Huangsen Cao, Zhouhan Lin, Yu Zhao, Fei Wu • 2024

Related benchmarks

Task	Dataset	Result
AI-generated text detection	M4	AUROC 91.43	41
Machine-generated text detection	Xsum	AUROC 96.59	40
AI-generated text detection	Essay	AUROC (GPT4All) 99.54	35
Machine-generated text detection	WritingPrompts	--	30
AI-generated text detection	RealDet	AUROC 93.9	27
AI-generated text detection	DetectRL Multi-LLM	AUROC 75.36	27
AI-generated text detection	DetectRL Multi-Domain	AUROC 67.3	27
LLM-generated text detection	EvoBench	LLaMA3 Score 90.8	26
Machine-generated text detection	MGTEVAL SemEval 2024 human text & Qwen3 machine-generated text 1.0 (test)	Accuracy 91.5	26
Machine-generated text detection	MAGE	AUROC (Avg) 70.71	24

Showing 10 of 70 rows
