
Xsum, WritingPrompts, and SQuAD

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| LLM-generated text detection | Xsum, WritingPrompts, and SQuAD Aggregated (test) | GPT2-XL 99.46 | 15 |
| LLM-generated text detection | Xsum, WritingPrompts, and SQuAD generated by GPT-5-Chat (test) | AUROC 81.33 | 15 |
| LLM-generated text detection | Xsum, WritingPrompts, and SQuAD generated by GPT-4.1-mini (test) | AUROC 80.25 | 15 |
| LLM-generated text detection | Xsum, WritingPrompts, and SQuAD Gemini-1.5-Flash (test) | AUROC 75.14 | 15 |