Open-ended Text Generation

Benchmarks

Dataset Name	SOTA Method	Metric	Value	Count	Updated
WikiText-2	DEL	COH Score	0.811	112	4mo ago
PTB	DEL	COH Score	67.7	64	4mo ago
WikiText	ϵ-sampling + VCM	Rep-2	54.83	18	1mo ago
Law-MT Out of Domain (test)	FoSS	MAUVE	32.17	16	4mo ago
Scaling Data Store	FoSS	MAUVE	33.79	12	4mo ago
Story (test)	Typical Sampling	Diversity (DIV)	0.96	12	4mo ago
Wikinews (test)	Typical Sampling	Diversity (DIV)	0.95	12	4mo ago
Wikitext (test)	Typical Sampling	Diversity (DIV)	95	12	4mo ago
CEB	FairSteer	Sentiment Score	80	12	4mo ago
FinRED		Diversity	98.9	11	4mo ago
ArXiv Dataset	InferDPT + RANTEXT	Diversity	94.8	11	4mo ago
Wikitext-103 v1		Diversity	98.7	11	4mo ago
CNN/Daily Mail		Diversity	98.3	11	4mo ago
WritingPrompts		PPL	1.76	10	4mo ago
Wikitext-103		PPL	2.55	10	4mo ago
MTBench	DECA	MT-1 Score	4.88	8	1mo ago
Wikitext-103 (test)	DITTO	Win Rate	84	8	4mo ago
WikiNews	MACD	MAUVE	92	5	1mo ago
WildBench	CTC-trained MDLM	Score	-1.7	4	4mo ago
Creative-Writing-Bench v3	CTC-trained MDLM	Score	27.4	4	4mo ago
Arena-hard Creative-Writing	CTC-trained MDLM	Pairwise Win Rate	80.2	4	4mo ago
Arena-hard Hard-Prompt	LLaDA-1.5	Pairwise Win Rate	58.5	4	4mo ago
Chatbot Arena inspired qualitative prompts (val)	Mamba	ELO	1,150.78	4	4mo ago
HalluDial	+DFT+DFD	BERTScore	76.81	3	4mo ago
BookCorpus Story Project Gutenberg	MACD	MAUVE	0.91	2	1mo ago