LLM-generated text detection on Xsum, WritingPrompts, and SQuAD (GPT-5-Chat test)
Best result: SurpMark, 81.33 AUROC (Oct 8, 2025)
Updated 21d ago
Evaluation Results
Method          Setting                    Date     AUROC
SurpMark        Black-box setting=true...  2025.10  81.33
SurpMark        Black-box setting=true...  2025.10  78.33
R-Detect        Black-box setting=true     2025.10  67.75
FourierGPT      Black-box setting=true     2025.10  64.82
DetectNPR       Black-box setting=true...  2025.10  54.99
DetectGPT       Black-box setting=true...  2025.10  54.6
LogRank         Black-box setting=true     2025.10  49.83
DetectLRR       Black-box setting=true     2025.10  49.83
DNA-GPT         Black-box setting=true...  2025.10  49.82
Binoculars      Black-box setting=true     2025.10  49.65
Likelihood      Black-box setting=true     2025.10  49.62
Entropy         Black-box setting=true     2025.10  46.99
Lastde++        Black-box setting=true...  2025.10  43.51
Fast-DetectGPT  Black-box setting=true     2025.10  43.39
Lastde          Black-box setting=true     2025.10  41.96
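
For readers unfamiliar with the metric, the following is a minimal sketch of how an AUROC value on the 0 to 100 scale used above can be computed from detector scores. It is not taken from any of the listed methods: the function name `auroc`, the score lists, and the convention that a higher score means "more likely machine-generated" are all assumptions made for illustration.

```python
from typing import Sequence


def auroc(human_scores: Sequence[float], machine_scores: Sequence[float]) -> float:
    """Return AUROC in percent.

    Hypothetical helper: assumes a higher score means "more likely
    machine-generated". The value is the fraction of (machine, human)
    pairs in which the machine-written text gets the higher score.
    """
    if not human_scores or not machine_scores:
        raise ValueError("both score lists must be non-empty")

    wins = 0.0
    for m in machine_scores:
        for h in human_scores:
            if m > h:
                wins += 1.0   # detector ranks this pair correctly
            elif m == h:
                wins += 0.5   # ties count as half a correct ranking
    return 100.0 * wins / (len(human_scores) * len(machine_scores))


# Made-up scores, for illustration only.
human = [0.1, 0.4, 0.35]
machine = [0.8, 0.3, 0.9]
print(round(auroc(human, machine), 2))
```

Under this definition, 50 corresponds to chance-level ranking and 100 to perfect separation, so a value below 50 means the detector's scores rank human-written text above machine-generated text more often than not.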