
XSum, SQuAD, WritingPrompts

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| AI-generated text detection | XSum, SQuAD, WritingPrompts (Average across 12 source models) | AUROC 99.07 | 11 |
| AI-generated text detection | XSum, SQuAD, WritingPrompts (Phi2-2.7 generated) | AUROC 98.75 | 11 |
| AI-generated text detection | XSum, SQuAD, WritingPrompts (Gemma-7 generated) | AUROC 98.91 | 11 |
| AI-generated text detection | XSum, SQuAD, WritingPrompts (Falcon-7 generated) | AUROC 99.71 | 11 |
| AI-generated text detection | XSum, SQuAD, WritingPrompts (Bloom-7.1 generated) | AUROC 99.94 | 11 |
| AI-generated text detection | XSum, SQuAD, WritingPrompts (OPT-13 generated) | AUROC 99.33 | 11 |
| AI-generated text detection | XSum, SQuAD, WritingPrompts (Llama3-8 generated) | AUROC 99.9 | 11 |
| AI-generated text detection | XSum, SQuAD, WritingPrompts (Llama2-13 generated) | AUROC 98.01 | 11 |
| AI-generated text detection | XSum, SQuAD, WritingPrompts (Llama-13 generated) | AUROC 97.41 | 11 |
| AI-generated text detection | XSum, SQuAD, WritingPrompts (GPT-J generated) | AUROC 99.74 | 11 |
| AI-generated text detection | XSum, SQuAD, WritingPrompts (OPT-2.7 generated) | AUROC 99.67 | 11 |
| AI-generated text detection | XSum, SQuAD, WritingPrompts (Neo-2.7 generated) | AUROC 99.88 | 11 |
| AI-generated text detection | XSum, SQuAD, WritingPrompts (GPT-2 generated) | AUROC 99.72 | 11 |
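
For readers unfamiliar with the metric, AUROC can be read as the probability that a detector assigns a higher score to a randomly chosen machine-generated text than to a randomly chosen human-written one, with ties counted as half. The results listed here appear to be on a 0-100 scale. The sketch below is only an illustration of that definition, not the evaluation code behind any of these results: the function name `auroc` and the score lists are hypothetical.

```python
def auroc(human_scores, machine_scores):
    """Pairwise (Mann-Whitney) form of AUROC.

    Returns the fraction of (machine, human) pairs in which the
    machine-generated text gets the higher detector score.
    Ties count as half a win.
    """
    wins = 0.0
    for m in machine_scores:
        for h in human_scores:
            if m > h:
                wins += 1.0
            elif m == h:
                wins += 0.5
    return wins / (len(machine_scores) * len(human_scores))


# Hypothetical detector scores (higher = more likely machine-generated).
human = [0.10, 0.35, 0.40, 0.80]
machine = [0.30, 0.70, 0.90, 0.95]

# Scale by 100 to match the convention the leaderboard appears to use.
print(f"AUROC: {100 * auroc(human, machine):.2f}")
```

A value of 50 corresponds to chance-level ranking and 100 to perfect separation of the two classes. The pairwise loop is quadratic in the number of texts, which is fine for illustration; a library routine would be the usual choice for large evaluation sets.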