DetectRL

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| LLM-generated text detection | DetectRL Out-of-Domain Multi-Topic 1.0 (test) | Average Detection Score: 91.1 | 18 |
| LLM-generated text detection | DetectRL Out-of-Domain Multi-LLM 1.0 (test) | Average Performance Score: 90.6 | 16 |
| Machine-generated text detection | DetectRL Multi-LLM (in-domain) | Score (GPT-3.5): 99.7 | 14 |
| Machine-generated text detection | DetectRL Multi-Topic (in-domain) | arXiv Score: 1 | 14 |
| AI-generated text detection | DetectRL Multi-Domain | AUROC: 91.51 | 13 |
| AI-generated text detection | DetectRL Multi-LLM | AUROC: 90.67 | 13 |
| Machine-generated text detection | DetectRL Training Text: Llama-2-70b (test) | Detection Score (Llama-2-70b): 90.2 | 12 |
| Machine-Generated Text Detection | DetectRL (test) | Detection Score (Llama-2-70b): 50.56 | 12 |
| Machine-Generated Text Detection | DetectRL Google-PaLM (train) | TPR@FPR-1% (Llama-2-70b): 50.58 | 12 |
| Machine-Generated Text Detection | DetectRL Training Text: ChatGPT | TPR@FPR-1% (Llama-2-70b): 50.66 | 12 |
| Machine-generated text detection | DetectRL Google-PaLM | AUROC: 77.8 | 6 |
| Machine-generated text detection | DetectRL Llama-2-70b | AUROC: 0.8122 | 6 |