WritingPrompts

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| LGT Detection | WritingPrompts small Fast-DetectGPT benchmark (test) | AUROC 99.9 | 54 |
| LGT Detection | WritingPrompts-small Fast-DetectGPT benchmark | AUROC 99.9 | 54 |
| Language Modeling | WritingPrompts | MAUVE 32 | 33 |
| Machine-generated text detection | WritingPrompts | AUROC 1 | 30 |
| LLM Text Attribution | WritingPrompts | TPR (FPR=0.01) 100 | 18 |
| Story Generation | WritingPrompts | Brier Accuracy (BA) 98.44 | 16 |
| Text Generation | WritingPrompts | ROUGE-1 34.6 | 15 |
| Detection of LLM-Generated Text | WritingPrompts GPT-J-6B | AUROC 97.6 | 15 |
| Watermark Detection | WritingPrompts English (test) | TPR@FPR5% 98.8 | 15 |
| Language Modeling | WritingPrompts (test) | Diversity (div) 88 | 14 |
| Multi-branch story generation | WritingPrompts | Diversity 3.7 | 10 |
| Text generation | WritingPrompts | F1 Score 22.11 | 10 |
| Open-ended Text Generation | WritingPrompts | PPL 1.76 | 10 |
| Text Generation | WritingPrompts (WP) (test) | BLEU-1 0.224 | 10 |
| Narrative Script Refinement | WritingPrompts | Character Development 20.64 | 8 |
| Output Sequence Length Prediction | WritingPrompts super-long sequences (> 17k tokens) OOD | MAE 195.89 | 8 |
| Story Generation | WritingPrompts (test) | PPL (Generation) 8.14 | 5 |
| AI-Generated Text Detection | WritingPrompts | TPR 94.23 | 5 |
| LLM-generated text detection | WritingPrompts Fast-DetectGPT | AUROC 98.8 | 5 |
| Multi-branch story generation | WritingPrompts | BLEU 58.6 | 4 |
| LLM-generated text detection | WritingPrompts Llama3-8B | TPR @ FPR=1% 100 | 3 |
| LLM-generated text detection | WritingPrompts GPT-4.1-mini | TPR @ FPR=1% 31.33 | 3 |
| Machine-generated text detection | WritingPrompts cross-source corruption DetectRL (test) | AUROC 84.15 | 3 |
| Story Generation Evaluation | WritingPrompts (WP) (test) | Fascination 73.88 | 2 |
| Open-ended Text Generation | WritingPrompts (test) | Same Count 85 | 2 |

Showing 25 of 25 rows