Beyond the Final Actor: Modeling the Dual Roles of Creator and Editor for Fine-Grained LLM-Generated Text Detection

About

The misuse of large language models (LLMs) requires precise detection of synthetic text. Existing works mainly follow binary or ternary classification settings, which at best distinguish pure human text, pure LLM text, and collaborative text. This remains insufficient for nuanced regulation, as LLM-polished human text and humanized LLM text often trigger different policy consequences. In this paper, we explore fine-grained LLM-generated text detection under a rigorous four-class setting. To handle this complexity, we propose RACE (Rhetorical Analysis for Creator-Editor Modeling), a fine-grained detection method that characterizes the distinct signatures of the creator and the editor. Specifically, RACE uses Rhetorical Structure Theory (RST) to construct a logic graph for the creator's foundation while extracting Elementary Discourse Unit (EDU)-level features for the editor's style. Experiments show that RACE outperforms 12 baselines in identifying fine-grained types with low false alarms, offering a policy-aligned solution for LLM regulation.

Yang Li, Qiang Sheng, Zhengjia Wang, Yehan Yang, Danding Wang, Juan Cao • 2026

Related benchmarks

Task	Dataset	Result
Fine-Grained LLM-Generated Text Detection	HART 4-class setting	AUROC 97.99	13
LLM-generated text detection	HART (default random split)	Avg TPR @ 5% FPR 94.41	12
LLM-generated text detection	HART (group-aware split)	AUROC 96.59	4
LLM-generated text detection	HART Arxiv domain (Leave-One-Domain-Out)	AUROC 96.61	3
LLM-generated text detection	HART Essay domain (Leave-One-Domain-Out)	AUROC 95.88	3
LLM-generated text detection	HART News domain (Leave-One-Domain-Out)	AUROC 92.69	3
LLM-generated text detection	HART Writing domain (Leave-One-Domain-Out)	AUROC 86.2	3
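
The "TPR @ 5% FPR" metric in the table reports how many positive examples a detector catches when its threshold is set so that at most 5% of negative examples are flagged. The sketch below shows one common way to compute it for a single binary task. It is an illustration only, not the paper's evaluation code: the function name `tpr_at_fpr`, the convention that a higher score means "more likely LLM-generated", and the toy scores are all assumptions, and how the reported figure is averaged is not specified here.

```python
def tpr_at_fpr(scores, labels, max_fpr=0.05):
    """Return the best TPR reachable while keeping FPR <= max_fpr.

    scores: detector outputs, higher means more likely positive (assumed convention).
    labels: 1 for positive (e.g. LLM-involved text), 0 for negative (e.g. human text).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    best_tpr = 0.0
    # Sweep thresholds from strict to permissive; predict positive when score >= t.
    # FPR can only grow as t decreases, so stop at the first violation.
    for t in sorted(set(scores), reverse=True):
        fpr = sum(s >= t for s in neg) / len(neg)
        if fpr > max_fpr:
            break
        best_tpr = sum(s >= t for s in pos) / len(pos)
    return best_tpr


# Toy data (hypothetical): 20 negatives with low scores, 4 positives.
neg_scores = [i / 100 for i in range(20)]   # 0.00 .. 0.19
pos_scores = [0.5, 0.6, 0.185, 0.05]
scores = neg_scores + pos_scores
labels = [0] * len(neg_scores) + [1] * len(pos_scores)

print(tpr_at_fpr(scores, labels, max_fpr=0.05))
```

With 20 negatives, a 5% budget allows exactly one false alarm, so the threshold can drop just far enough to admit one negative before the sweep stops. Reporting TPR at a fixed low FPR, rather than AUROC alone, reflects the abstract's emphasis on low false alarms.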
