Black-Box Detection of LLM-Generated Text Using Generalized Jensen-Shannon Divergence

About

We study black-box detection of machine-generated text under practical constraints: the scoring model (proxy LM) may mismatch the unknown source model, and per-input contrastive generation is costly. We propose SurpMark, a reference-based detector that summarizes a passage by the dynamics of its token surprisals. SurpMark discretizes surprisals into interpretable states, estimates a state-transition matrix for the test text, and scores it via a generalized Jensen-Shannon (GJS) gap between the test transitions and two fixed references (human vs. machine) built once from existing corpora. Theoretically, we derive design guidance for how the discretization bins should scale with data and provide a principled justification for our test statistic. Empirically, across multiple datasets, source models, and scenarios, SurpMark consistently matches or surpasses baselines, demonstrating strong robustness across domains and generators; our experiments on hyperparameter sensitivity exhibit trends that our theoretical results help to explain.
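The pipeline described above (discretize surprisals, estimate a transition matrix, compare against two fixed references) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes per-token surprisals have already been computed by some proxy LM, and the function names, the fixed bin edges, the additive smoothing, the uniform averaging over rows, the particular GJS form, and the sign convention of the score are all assumptions made here for illustration.

```python
import numpy as np

def transition_matrix(surprisals, bin_edges, smoothing=1e-3):
    """Discretize surprisals into states and estimate a row-stochastic
    state-transition matrix (additive smoothing is an assumption)."""
    states = np.digitize(surprisals, bin_edges)   # state ids in 0..len(bin_edges)
    k = len(bin_edges) + 1
    counts = np.full((k, k), float(smoothing))
    np.add.at(counts, (states[:-1], states[1:]), 1.0)  # count consecutive pairs
    return counts / counts.sum(axis=1, keepdims=True)

def kl(p, q):
    """KL divergence D(p || q) for strictly positive distributions."""
    return float(np.sum(p * np.log(p / q)))

def gjs(p, q, alpha=1.0):
    """One common form of generalized Jensen-Shannon divergence (assumed here):
    D(p || m) + alpha * D(q || m), with m = (p + alpha * q) / (1 + alpha)."""
    m = (p + alpha * q) / (1.0 + alpha)
    return kl(p, m) + alpha * kl(q, m)

def gjs_gap_score(test_surprisals, human_T, machine_T, bin_edges, alpha=1.0):
    """Hypothetical score: GJS to the human reference minus GJS to the machine
    reference, averaged uniformly over rows. Under this convention a positive
    value means the test text looks closer to the machine reference."""
    test_T = transition_matrix(test_surprisals, bin_edges)
    d_human = np.mean([gjs(t, h, alpha) for t, h in zip(test_T, human_T)])
    d_machine = np.mean([gjs(t, m, alpha) for t, m in zip(test_T, machine_T)])
    return float(d_human - d_machine)
```

In this sketch the two reference matrices would be built once by calling transition_matrix on surprisals from a human corpus and a machine corpus, then reused for every test passage, which is what avoids per-input contrastive generation.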

Shuangyi Chen, Ashish Khisti • 2025

Related benchmarks

Task	Dataset	Result
Detection of LLM generated text	WritingPrompts GPT-J-6B	AUROC 97.6	15
Detection of LLM generated text	XSum GPT-J-6B	AUROC 88.35	15
LLM-generated text detection	Xsum, WritingPrompts, and SQuAD Gemini-1.5-Flash (test)	AUROC 75.14	15
LLM-generated text detection	Xsum, WritingPrompts, and SQuAD generated by GPT-4.1-mini (test)	AUROC 80.25	15
LLM-generated text detection	Xsum, WritingPrompts, and SQuAD generated by GPT-5-Chat (test)	AUROC 81.33	15
LLM-generated text detection	Xsum, WritingPrompts, and SQuAD Aggregated (test)	GPT2-XL 98.35	15
Machine-generated text detection	DetectRL-arXiv cross-source corruption (test)	AUROC 93.86	9
LLM-generated text detection	XSum GPT-5-Chat	TPR @ FPR=1% 31.33	3
LLM-generated text detection	WritingPrompts GPT-4.1-mini	TPR @ FPR=1% 31.33	3
LLM-generated text detection	WritingPrompts Llama3-8B	TPR @ FPR=1% 100	3

Showing 10 of 25 rows
