
MAGE

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Machine-Generated Text Detection | MAGE | TP @ 20% | 85.12 | 18 |
| Detection of Machine-Generated Text | MAGE Main Experimental Supplement (test) | TP@20% | 85.12 | 14 |
| Detection of Machine-Generated Text | MAGE (test) | TP @ 20% Threshold | 85.12 | 14 |
| Machine-Generated Text Detection | MAGE COLING2025 (val) | AUC | 79.55 | 13 |
| Machine-generated text detection | MAGE Unseen Domains & Unseen Model (test) | Human Recall | 95.65 | 9 |
| AI-generated text detection | MAGE BigScience 1.0 (test) | Accuracy | 96.7 | 8 |
| AI-generated text detection | MAGE GLM 1.0 (test) | Accuracy | 94.1 | 8 |
| AI-generated text detection | MAGE OPT 1.0 (test) | Accuracy | 89.1 | 8 |
| AI-generated text detection | MAGE (LLaMA) 1.0 (test) | Accuracy | 88 | 8 |
| AI-generated text detection | MAGE GPT 1.0 (test) | Accuracy | 82.7 | 8 |
| AI-generated text detection | MAGE FLAN-T5 1.0 (test) | Accuracy | 68.9 | 8 |
| Detection of LLM-generated text | MAGE Topic-based 3.5-turbo | Detection Accuracy | 100 | 8 |
| Detection of LLM-generated text | MAGE News Topic-based 3.5-turbo | Detection Performance | 99.95 | 8 |
| LLM-generated text detection | MAGE QA short text (<= 30 words) | AUROC | 0.9747 | 8 |
| LLM-generated text detection | MAGE News short text (<= 30 words) | AUROC | 93.48 | 8 |
| Detection of LLM generated text | MAGE QA | ROC AUC (FPR=1%) | 65.33 | 8 |
| Detection of LLM generated text | MAGE News | ROC AUC @ FPR=1% | 0.6577 | 8 |
| LLM-generated text detection | MAGE DIPPER attack | Human Score | 77.44 | 8 |
| Detection Evasion | MAGE | TPR@1% (R) | 22.5 | 6 |
| Machine-generated text detection | MAGE Arbitrary-domains & Arbitrary-models (test) | Human Recall | 0.9572 | 5 |
| Machine-generated text detection | MAGE Paraphrasing Attack (test) | Human Recall | 79.66 | 4 |
| AI-Generated Text Detection | MAGE DeepSeek-R1 OOD | Accuracy | 71 | 3 |
| AI-Generated Text Detection | MAGE Claude-sonnet-4-5 OOD | Accuracy | 57.1 | 3 |
| AI-Generated Text Detection | MAGE GPT-5 OOD | Accuracy | 68.7 | 3 |
| AI-Generated Text Detection | MAGE GPT-4 OOD | Accuracy | 61.8 | 3 |