
OpenWebText

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Language Modeling | OpenWebText2 (test) | Perplexity 16.2 | 104 |
| Unconditional Text Generation | OpenWebText | Gen. PPL 1.21 | 100 |
| Language Modeling | OpenWebText | Perplexity 11 | 91 |
| Text Generation | OpenWebText | Perplexity 3.18 | 86 |
| Language Modeling | OpenWebText (val) | Validation Loss 2.6091 | 80 |
| Unconditional generation | OpenWebText (OWT) L=1024 (held-out) | MAUVE 1 | 45 |
| Language Modeling | OpenWebText (OWT) (val) | Perplexity 7.77 | 42 |
| Next Token Prediction | OpenWebText | PPL 18.68 | 30 |
| Next Token Prediction | OpenWebText (held-out) | ID PPL 18.53 | 30 |
| Clustering | OpenWebText | Clustering Score 0.6222 | 30 |
| Sentiment Steering | OpenWebText Neutral to Negative (test) | Perplexity (PPL) 12.48 | 27 |
| Sentiment Steering | OpenWebText Neutral to Positive (test) | Perplexity (PPL) 12.48 | 27 |
| Language Modeling | NanoGPT OpenWebText | Throughput (tokens/s) 391,100 | 24 |
| Unconditional Text Generation | OpenWebText (test) | LLAMA2 Score 692.3 | 21 |
| Language Modeling | OpenWebText (train) | Train Loss 2.5243 | 21 |
| Embedding Space Analysis | OpenWebText | Iso 0.98 | 18 |
| Language Modeling | OpenWebText (test) | Loss 2.65 | 18 |
| Language Modeling | OpenWebText standard (test) | Perplexity 20.08 | 17 |
| Language Modeling | OpenWebText (held-out set) | PPL 11.5 | 16 |
| Language Modeling | OpenWebText GPT-2 (test) | Perplexity 17.94 | 13 |
| Unconditional generation | OpenWebText L=2048 (test) | Gen. PPL 13.2 | 12 |
| Unconditional generation | OpenWebText L=1024 (test) | Generation Perplexity 14.1 | 12 |
| Language Modeling | OpenWebText2 (val) | Perplexity 17.12 | 12 |
| Sentiment Steering | OpenWebText Negative prompts (test) | Positivity Score 0.59 | 12 |
| Text Generation | OpenWebText (OWT) GPT-2 tokenizer (val) | PPL 15.36 | 12 |
Showing 25 of 52 rows