LLM-generated content detection on HumanEval (AUROC)
Best AUROC: 0.8173 (RAIDAR)
[Chart: AUROC over time; data point at May 7, 2026]
Updated 26d ago
Evaluation Results
Method          Details                     Links    AUROC
RAIDAR          -                           2026.05  0.8173
LiSCP           paraphrase model=GPT-3...   2026.05  0.8108
Fast-DetectGPT  -                           2026.05  0.6679
R-Detect        -                           2026.05  0.649
Ghostbuster     substitute LLM=GPT-3.5...   2026.05  0.5378
LogRank         -                           2026.05  0.535
Rank            -                           2026.05  0.5348
DetectGPT       perturbation model=T5-3B    2026.05  0.5267
RoBERTa-large   Backbone=large              2026.05  0.4692
Entropy         -                           2026.05  0.4306
RoBERTa-base    Backbone=base               2026.05  0.4217
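
For readers unfamiliar with the metric, AUROC can be read as the probability that a detector scores a randomly chosen LLM-generated sample higher than a randomly chosen human-written one, with ties counted as half. A value of 0.5 corresponds to chance, so the entries below 0.5 rank the two classes worse than a coin flip on this benchmark. The sketch below is only an illustration of that definition: the function name `auroc` and the toy labels and scores are hypothetical and are not taken from the leaderboard or from any of the listed methods.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    labels: 1 for LLM-generated, 0 for human-written (assumed convention).
    scores: detector outputs, where higher means "more likely LLM-generated".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not benchmark results.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
result = auroc(labels, scores)  # 8 of 9 positive/negative pairs are ranked correctly
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; for real evaluations a library routine such as scikit-learn's `roc_auc_score` is the more usual choice.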