DSIPA: Detecting LLM-Generated Texts via Sentiment-Invariant Patterns Divergence Analysis

About

The rapid advancement of large language models (LLMs) presents new security challenges, particularly in detecting machine-generated text used for misinformation, impersonation, and content forgery. Most existing detection approaches lack robustness against adversarial perturbation, paraphrasing attacks, and domain shifts, and often require restrictive access to model parameters or large labeled datasets. To address these limitations, we propose DSIPA, a novel training-free framework that detects LLM-generated content by quantifying the stability of sentiment distributions under controlled stylistic variation. It is based on the observation that LLMs typically exhibit more emotionally consistent outputs, while human-written texts display greater affective variation. Our framework operates in a zero-shot, black-box manner, leveraging two unsupervised metrics, sentiment distribution consistency and sentiment distribution preservation, to capture these intrinsic behavioral asymmetries without the need for parameter updates or probability access. Extensive experiments are conducted on state-of-the-art proprietary and open-source models, including GPT-5.2, Gemini-1.5-pro, Claude-3, and LLaMa-3.3. Evaluations on five domains (news articles, programming code, student essays, academic papers, and community comments) demonstrate that DSIPA improves F1 detection scores by up to 49.89% over baseline methods. The framework exhibits superior generalizability across domains and strong resilience to adversarial conditions, providing a robust and interpretable behavioral signal for secure content identification in the evolving LLM landscape.
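
The abstract does not define the two metrics, so the following is only a minimal sketch of the general idea: compare the sentiment distribution of a text with that of a stylistically rewritten variant, and treat a small divergence as a sign of stability. Everything here is an assumption made for illustration, not the authors' method: the toy word lexicon, the three-way sentence labeling, the use of Jensen-Shannon divergence, and the names `sentiment_distribution`, `js_divergence`, and `stability_score`. The rewritten variant is assumed to be supplied by the caller (for example, from a paraphrasing model).

```python
import math
from collections import Counter

# Toy lexicon, purely illustrative (an assumption, not taken from the paper).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "sad"}
LABELS = ("negative", "neutral", "positive")


def sentence_label(sentence):
    """Label one sentence by counting lexicon hits (hypothetical scorer)."""
    words = [w.strip(".,!?;:").lower() for w in sentence.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def sentiment_distribution(text):
    """Return the smoothed share of negative/neutral/positive sentences."""
    flat = text.replace("!", ".").replace("?", ".")
    sentences = [s for s in flat.split(".") if s.strip()]
    counts = Counter(sentence_label(s) for s in sentences)
    total = len(sentences) + len(LABELS)  # add-one smoothing avoids zero bins
    return [(counts[label] + 1) / total for label in LABELS]


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, bounded in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def stability_score(original, rewritten):
    """Higher means the sentiment distribution changed less after rewriting."""
    p = sentiment_distribution(original)
    q = sentiment_distribution(rewritten)
    return 1.0 - js_divergence(p, q)
```

Under the abstract's observation, a score close to 1 across several rewrites would be more typical of LLM-generated text, while larger drops would be more typical of human writing. Any decision threshold would have to be chosen empirically and is not specified here.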

Siyuan Li, Aodu Wulianghai, Guangyan Li, Xi Lin, Qinghua Mao, Yuliang Chen, Jun Wu, Jianhua Li • 2026

Related benchmarks

Task | Dataset | Metric | Result | Rank
LLM-generated content detection | Reuter News | Std Dev of F1 Score | 3.91 | 12
LLM-generated content detection | HumanEval | Std Dev of F1 Score | 4.87 | 12
LLM-generated content detection | Student Essay | Std Dev of F1 Score | 4.26 | 12
LLM-generated content detection | Academic Paper | Std Dev of F1 Score | 4.68 | 12
LLM-generated text detection | Reuter News Dataset | F1 (Original) | 89.2 | 12
LLM-generated text detection | Yelp review dataset | F1 (Original) | 89.8 | 12
LLM-generated content detection | Yelp Review | Std Dev of F1 Score | 5.11 | 12