WikiText-2

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Language Modeling | WikiText-2 | Perplexity (PPL) | 1.61 | 841 |
| Open-ended text generation | WikiText-2 | COH Score | 0.811 | 112 |
| Language Modeling | WikiText-2 raw (test) | Perplexity | 5.674 | 57 |
| Language Modeling | WikiText-2 v1 (val) | Perplexity | 42.41 | 19 |
| Language Modeling | Wikitext 2 Llama 2 & 3 (test) | PPL (Llama 2, Config 7) | 5.47 | 16 |
| Language Modeling | Wikitext-2 Standard (val) | Perplexity | 40.2 | 12 |
| Causal Prediction | WikiText-2 (val) | Min Validation Loss | 5.4856 | 11 |
| Language Modeling | WikiText-2 | Perplexity (Baseline) | 11.02 | 8 |
| Language Generation | WikiText-2 (test) | Perplexity | 3.319 | 8 |
| Language Modeling | WikiText-2 context length 2048 (test) | Perplexity | 7.15 | 7 |
| Language Modeling | Wikitext-2 word-level (dev) | PPL | 66.9 | 7 |
| Language Modeling | WikiText-2 context length 8192 (test) | Perplexity | 6.5 | 5 |
| Language Modeling | WikiText-2 context length 4096 (test) | PPL (WikiText-2) | 6.36 | 5 |
| Language Modeling | WikiText-2 Top 5% most uncertain tokens (test) | NLL | 5.12 | 5 |
| Language Modeling | WikiText-2 Top 5% most uncertain tokens (val) | NLL | 5.13 | 5 |
| Language Modeling | WikiText-2 All tokens (test) | NLL | 1.66 | 5 |
| Language Modeling | WikiText-2 All tokens (val) | NLL | 1.7 | 5 |
| Language Modeling | WikiText-2 LLaMA-2 7B | PPL | 5.12 | 3 |
| Character-level Language Modeling | WikiText-2 (val) | PPL (Validation) | 3.49 | 3 |