LLM-generated text detection on Xsum, WritingPrompts, and SQuAD generated by GPT-4.1-mini (test)
[Chart: AUROC over time. Top result: SurpMark, 80.25 AUROC, Oct 8, 2025.]
Evaluation Results
Method          Settings                   Date     AUROC
SurpMark        Black-box setting=true...  2025.10  80.25
SurpMark        Black-box setting=true...  2025.10  78.48
R-Detect        Black-box setting=true     2025.10  71.64
Binoculars      Black-box setting=true     2025.10  71.12
DetectNPR       Black-box setting=true...  2025.10  70.83
DetectGPT       Black-box setting=true...  2025.10  70.08
Fast-DetectGPT  Black-box setting=true     2025.10  68.32
Lastde++        Black-box setting=true...  2025.10  68.23
LogRank         Black-box setting=true     2025.10  66.8
Likelihood      Black-box setting=true     2025.10  66.77
DetectLRR       Black-box setting=true     2025.10  63.29
FourierGPT      Black-box setting=true     2025.10  63.05
Lastde          Black-box setting=true     2025.10  57.28
DNA-GPT         Black-box setting=true...  2025.10  56.71
Entropy         Black-box setting=true     2025.10  38.72