OpenWebText

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Unconditional Text Generation | OpenWebText | Gen. PPL: 1.21 | 219 |
| Text Generation | OpenWebText | Perplexity: 3.18 | 142 |
| Language Modeling | OpenWebText | Perplexity: 11 | 122 |
| Language Modeling | OpenWebText (val) | Validation Loss: 2.6091 | 114 |
| Language Modeling | OpenWebText2 (test) | Perplexity: 16.2 | 104 |
| Text Generation | OpenWebText | Gen PPL: 11.312 | 54 |
| Unconditional generation | OpenWebText unconditional generation L = 1024 | MAUVE: 100 | 50 |
| Unconditional generation | OpenWebText (OWT) L=1024 (held-out) | MAUVE: 1 | 45 |
| Conditional generation | OpenWebText | Generation Perplexity (Gen.PPL): 24.11 | 42 |
| Language Modeling | OpenWebText (OWT) (val) | Perplexity: 7.77 | 42 |
| Unconditional generation | OpenWebText L=1024 (test) | Generation Perplexity: 14.1 | 40 |
| Language Modeling | OpenWebText (test) | Average Perplexity: 2.947 | 31 |
| Unconditional Text Generation | OpenWebText (OWT) (test) | Generation Perplexity: 27.35 | 30 |
| Next Token Prediction | OpenWebText | PPL: 18.68 | 30 |
| Next Token Prediction | OpenWebText (held-out) | ID PPL: 18.53 | 30 |
| Clustering | OpenWebText | Clustering Score: 0.6222 | 30 |
| Sentiment Steering | OpenWebText Neutral to Negative (test) | Perplexity (PPL): 12.48 | 27 |
| Sentiment Steering | OpenWebText Neutral to Positive (test) | Perplexity (PPL): 12.48 | 27 |
| Language Modeling | OpenWebText2 | Perplexity (PPL): 5.45 | 25 |
| Language Modeling | NanoGPT OpenWebText | Throughput (tokens/s): 391,100 | 24 |
| Unconditional Text Generation | OpenWebText (test) | LLAMA2 Score: 692.3 | 21 |
| Language Modeling | OpenWebText (train) | Train Loss: 2.5243 | 21 |
| Embedding Space Analysis | OpenWebText | Iso: 0.98 | 18 |
| Language Modeling | OpenWebText GPT-2 (val) | Perplexity: 16.11 | 17 |
| Language Modeling | OpenWebText standard (test) | Perplexity: 20.08 | 17 |
Showing 25 of 73 rows