SeqXGPT: Sentence-Level AI-Generated Text Detection
About
Widely applied large language models (LLMs) can generate human-like content, raising concerns about the abuse of LLMs. Therefore, it is important to build strong AI-generated text (AIGT) detectors. Current works only consider document-level AIGT detection. In this paper, we therefore first introduce a sentence-level detection challenge by synthesizing a dataset of documents polished with LLMs, that is, documents that contain both sentences written by humans and sentences modified by LLMs. We then propose Sequence X (Check) GPT (SeqXGPT), a novel method that uses log probability lists from white-box LLMs as features for sentence-level AIGT detection. These features are composed like waves in speech processing and cannot be studied by LLMs. Therefore, we build SeqXGPT on convolution and self-attention networks. We test it on both sentence-level and document-level detection challenges. Experimental results show that previous methods struggle with sentence-level AIGT detection, while our method not only significantly surpasses baseline methods on both sentence-level and document-level detection challenges but also exhibits strong generalization capabilities.
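The abstract does not spell out the feature pipeline, so the following is only a minimal sketch of the two ideas it names: a per-token log probability list, and a sentence-level decision derived from token-level labels. The function names, the toy logits, and the majority-vote aggregation are all assumptions made for illustration; the convolution and self-attention model that would actually produce the token labels is omitted.

```python
import math
from collections import Counter


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def token_log_probs(step_logits, token_ids):
    """Build the log probability list for a token sequence.

    step_logits[t] is a hypothetical logit vector that a white-box LLM
    would emit when predicting token_ids[t]; real logits would come from
    an actual model, which this sketch does not load.
    """
    return [log_softmax(logits)[tok]
            for logits, tok in zip(step_logits, token_ids)]


def sentence_labels(token_labels, sentence_spans):
    """Assign each sentence the most frequent label among its tokens.

    sentence_spans holds (start, end) token offsets, end exclusive.
    Majority voting is an assumption of this sketch, not a confirmed
    detail of the method.
    """
    labels = []
    for start, end in sentence_spans:
        counts = Counter(token_labels[start:end])
        labels.append(counts.most_common(1)[0][0])
    return labels


# Toy example: 5 tokens over a 2-word vocabulary, split into 2 sentences.
toy_logits = [[0.0, 0.0], [2.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.0]]
toy_tokens = [0, 0, 1, 1, 0]
features = token_log_probs(toy_logits, toy_tokens)

# Hypothetical per-token predictions from a trained sequence labeler.
predicted = ["human", "human", "ai", "ai", "human"]
per_sentence = sentence_labels(predicted, [(0, 2), (2, 5)])
```

In a setup like this, one log probability list per white-box model could be stacked into a multi-channel sequence, which is the kind of wave-like input that convolution layers are commonly applied to.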
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| AI-generated text detection | Cross-genre (test) | OA 88 | 32 |
| AIGT detection | HC3 PWWS attack, AI to Human (in-domain) | Overall Accuracy 99.75 | 28 |
| AI-generated text detection | mixed-source AI -> Human GPT-2, GPT-Neo, GPT-J, LLaMa, GPT-3 | Overall Accuracy 96.5 | 26 |
| AI-generated text detection | HC3 (test) | F1 (Overall) 99.33 | 18 |
| Authorship segmentation | MAS (entire corpus) | SBDA @ 0.331.5 | 18 |
| AI-generated text detection | Cross-genre AIGT Overall (test) | OA 86.75 | 14 |
| AIGT detection | HC3 PWWS attack, Human to AI (in-domain) | OA 100 | 14 |
| AIGT detection | HC3 Deep-Word-Bug attack Human to AI (in-domain) | Overall Accuracy 100 | 14 |
| AIGT detection | HC3 Pruthi attack Human to AI (in-domain) | OA 1 | 14 |
| AIGT detection | cross-domain AIGT detection AI -> Human | Overall Accuracy 93 | 14 |