Story

Benchmarks

Task Name	Dataset Name	Metric	SOTA Result
Creative Writing	Story	Semantic Diversity	38.6	20
Story generation	Story	Diversity	8.36	19
Question Answering	Story	Exact Match (EM)	46.7	14
Single change-point detection	Story	WD	0.207	12
Open-ended Text Generation	Story (test)	Diversity (DIV)	0.96	12
Machine Text Detection	Story	Rewrite AUC (Claude 3.5)	0.998	11
Multiple change-point detection	Story dataset GPT-5-mini K=5	WD	0.44	6
Text Generation	Story	Coherence Win Rate	63.6	4
