Text Generation

Benchmarks

Dataset Name	SOTA Method	Metric	Value	Count	Updated
OpenWebText	SDTT	Perplexity	3.18	187	18d ago
DomainBench	SYTTA-16	BLEU (Agriculture)	71.37	144	4mo ago
WikiText-2		Perplexity	4.88	138	1mo ago
LM1B (test)	UDLM	Entropy	2.46	90	1mo ago
NoveltyBench		Diversity	10	81	2mo ago
CNN/Daily Mail (test)	DEL	COH Score	83	64	4mo ago
GSM8K	LoRA + PCGrad	Accuracy	86.4	63	1mo ago
OpenWebText	IDLM-MDLM	Gen PPL	11.312	54	1mo ago
DFM 1024 samples masked backbone	TR-CIE	GPT-2 Perplexity	135.92	42	1mo ago
Medical Chatbot		ASR	100	42	4mo ago
OWT	DFM (ESD)	GPT2 Perplexity	5.33	41	3mo ago
LM1B	DFM (ESD)	Perplexity (PPL)	68.11	39	18d ago
5 Generation tasks	POP	Accuracy	57.96	36	4mo ago
Masked DFM backbone 1024 samples	θ-Trapezoidal	Unigram Entropy	7.61	35	1mo ago
Text Generation		PPL	11.9	33	4mo ago
Text model inference M4 Max	vllm-mlx	Throughput (tok/s)	525.5	31	4mo ago
OpenWebText	Parallel (Ours)	Generative Perplexity	22.099	30	23d ago
1024 samples	TR-CIE	Perplexity (GPT-2)	104.44	30	1mo ago
uniform DFM backbone 1024 samples	TR-CIE	LLaMA-3 Perplexity	95.56	30	1mo ago
MSCOCO	SARE	BLEU-1	57.2	26	4mo ago
Math-500	JetFlow	Throughput (TPS)	1,094.6	25	1mo ago
IFEval	Llama 3.1 8B Instruct	Accuracy	74.49	23	1mo ago
Wikitext-103	Refined by Gemma3 27B	Perplexity	32.88	23	3mo ago
Spec-Bench Overall	SpecBound	SD Score	2.33	21	3mo ago
MMLU (test)	TAP	BS Score	57.83	20	2mo ago

Showing 25 of 211 rows

...