
Creative Writing

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Creative Writing | Creative Writing | Win Rate: 58.6 | 36 |
| Creative Writing | Creative writing (test) | Creativity: 90.38 | 20 |
| Creative Writing | Creative Writing | Solved Rate: 51.98 | 16 |
| Creative Writing | Creative Writing EQ-Bench v3 | ELO: 829.05 | 13 |
| Creative Writing | Creative Writing | Discovery Score: 45.2 | 12 |
| Creative Writing | Creative Writing v3 | Overall Rubric Score: 52.8 | 10 |
| Creative Writing | Creative Writing Human Evaluation | Human Preference Count: 75 | 9 |
| AI Text Detection | Creative Writing | AUC: 99.9 | 7 |
| Creative Writing | Creative Writing | Win Rate vs Confidence: 70.1 | 6 |
| AI-generated text detection | Creative Writing Out-of-Domain | F1 Score: 95.7 | 5 |
| AI-generated text detection | Creative Writing In-Domain | F1 Score: 98.4 | 5 |
| Creative Writing | Creative Writing Alpaca-Eval 100 problems 2.0 | Length-Controlled Win Rate: 93.81 | 4 |