WikiText

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Language Modeling | WikiText-2 (test) | PPL = 2.56 | 1,949 |
| Language Modeling | WikiText | PPL = 0.2838 | 732 |
| Language Modeling | WikiText (test) | Perplexity = 5.49 | 62 |
| Language Modeling | WikiText (val) | Perplexity = 12.51 | 54 |
| Language Modeling | Wikitext | Wikitext PPL = 12.85 | 45 |
| Language Modeling | WikiText (held-out) | Perplexity (PPL) = 9.8 | 25 |
| Language Modeling | WikiText-103 | Throughput (tokens/s) = 159,000 | 21 |
| Language Modeling | WikiText v1 (test) | Perplexity = 13.33 | 18 |
| Language Modeling | WikiText (WT) | Relative PPL Change (%) = 31 | 16 |
| Language Modeling | WikiText | PPL Change (%) = 1.7 | 16 |
| Language Modeling | WikiText-2 vLLM harness (test) | Perplexity (PPL) = 8.87 | 12 |
| Privacy Measurement | WikiText | Epsilon = 0 | 12 |
| Open-ended Text Generation | Wikitext (test) | Diversity (DIV) = 95 | 12 |
| Prefilling Profiling | WikiText (test) | Time (s) = 38 | 10 |
| Language Modeling | Wikitext zero-shot | Perplexity = 25.75 | 10 |
| Language Modeling | WikiText (test) | ROUGE Score = 64.14 | 8 |
| Language Modeling | WikiText 1K | Perplexity = 13.8 | 7 |
| Model Compression Time | WikiText | Compression Time (s) = 196.34 | 6 |
| Membership Inference Attack | WikiText | TPR @ 0.1% FPR = 14 | 6 |
| Knowledge Evaluation | WikiText (eval) | BPB = 0.777 | 6 |
| Autoregressive Language Modeling | WikiText-103 (first 10M tokens) | Perplexity (PPL) = 90.5 | 5 |
| Language Modeling | WikiText byte-level | Wikitext PPL = 1.5798 | 5 |
| Masked Reconstruction | WikiText-103 | PPL = 4.94 | 5 |
| Text Generation | Wikitext | Coherence: CD Better Rate = 88.7 | 4 |
| Language Modeling | Wikitext zero-shot | Gap Closed (%) = 40.8 | 3 |
Showing 25 of 35 rows