
WikiText-103

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Language Modeling | WikiText-103 (test) | Perplexity 2.22 | 579 |
| Language Modeling | WikiText-103 (val) | PPL 1.01 | 214 |
| Language Modeling | WikiText-103 | PPL 4.59 | 189 |
| Word-level Language Modeling | WikiText-103 word-level (test) | Perplexity 15.79 | 65 |
| Word-level language modeling | WikiText-103 (dev) | Perplexity 15.72 | 64 |
| Language Modeling | Wikitext-103 | PPL 3.14 | 42 |
| Text Continuation | WikiText-103 512-token continuation (test) | Perplexity (PPL) 1 | 35 |
| Language Modeling | WikiText-103 zero-shot (test) | PPL 12.76 | 34 |
| Open-ended generation | Wikitext-103 (test) | MAUVE 0.96 | 26 |
| Language Generation | WikiText-103 | Perplexity (PPL) 1 | 25 |
| Tokenization | WikiText-103 | Latency (ms) 1.92 | 25 |
| Language Modeling | WikiText-103 | Delta PPL 0 | 25 |
| Text generation | Wikitext-103 | Perplexity 32.88 | 23 |
| Steganographic secret extraction | WikiText-103 W (test) | Accuracy 75 | 20 |
| Text Generation | WikiText-103 | Quality Better Count 24 | 14 |
| Membership Inference Attack | WikiText-103 | AUC 0.784 | 14 |
| Membership Inference Attack | WikiText-103 (test) | AUC 0.782 | 13 |
| Open-ended text generation | Wikitext-103 v1 | Diversity 98.7 | 11 |
| Language Modeling | WikiText-103 | Loss 3.156 | 10 |
| Open-ended Text Generation | Wikitext-103 | PPL 2.55 | 10 |
| Language Modeling | WikiText-103 small setting (test) | Perplexity 32.8 | 10 |
| Language Modeling | WikiText-103 small setting (val) | Perplexity 31.8 | 10 |
| Privacy-Preserving Text Generation | Wikitext-103 v1 | Cosine Similarity 0.627 | 9 |
| Autoregressive Language Modeling | Wikitext-103 | PPL 18.5 | 9 |
| Data Attribution | WikiText-103 (test) | Tail-patch Score 7.88 | 9 |
Showing 25 of 53 rows