One Billion Word

Benchmarks

Task Name	Dataset Name	Metric	SOTA Result	Count
Language Modeling	One Billion Word (OBW) 100% train set (test)	PPL	39.15	11
Language Modeling	One Billion Word (OBW) 1% train set (test)	PPL	63.83	11
Language Modeling	One Billion Word corpus 1M sentences	Perplexity	71.8	5
Text Generation	One Billion Word (test)	4-gram JSD	0.22	2
Language Modeling	One Billion Word Benchmark (train)	Perplexity	36.39	2
Language Generation	One Billion Word 6-gram	JSD	0.74	2
Language Generation	One Billion Word 4-gram	JSD	0.35	2
