| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Unconditional Text Generation | OpenWebText | Gen. PPL 1.21 | 219 |
| Text Generation | OpenWebText | Perplexity 3.18 | 142 |
| Language Modeling | OpenWebText | Perplexity 11 | 122 |
| Language Modeling | OpenWebText (val) | Validation Loss 2.6091 | 114 |
| Language Modeling | OpenWebText2 (test) | Perplexity 16.2 | 104 |
| Text Generation | OpenWebText | Gen PPL 11.312 | 54 |
| Unconditional generation | OpenWebText unconditional generation L = 1024 | MAUVE 100 | 50 |
| Unconditional generation | OpenWebText (OWT) L=1024 (held-out) | MAUVE 1 | 45 |
| Conditional generation | OpenWebText | Generation Perplexity (Gen.PPL) 24.11 | 42 |
| Language Modeling | OpenWebText (OWT) (val) | Perplexity 7.77 | 42 |
| Unconditional generation | OpenWebText L=1024 (test) | Generation Perplexity 14.1 | 40 |
| Language Modeling | OpenWebText (test) | Average Perplexity 2.947 | 31 |
| Unconditional Text Generation | OpenWebText (OWT) (test) | Generation Perplexity 27.35 | 30 |
| Next Token Prediction | OpenWebText | PPL 18.68 | 30 |
| Next Token Prediction | OpenWebText (held-out) | ID PPL 18.53 | 30 |
| Clustering | OpenWebText | Clustering Score 0.6222 | 30 |
| Sentiment Steering | OpenWebText Neutral to Negative (test) | Perplexity (PPL) 12.48 | 27 |
| Sentiment Steering | OpenWebText Neutral to Positive (test) | Perplexity (PPL) 12.48 | 27 |
| Language Modeling | OpenWebText2 | Perplexity (PPL) 5.45 | 25 |
| Language Modeling | NanoGPT OpenWebText | Throughput (tokens/s) 391,100 | 24 |
| Unconditional Text Generation | OpenWebText (test) | LLAMA2 Score 692.3 | 21 |
| Language Modeling | OpenWebText (train) | Train Loss 2.5243 | 21 |
| Embedding Space Analysis | OpenWebText | Iso 0.98 | 18 |
| Language Modeling | OpenWebText GPT-2 (val) | Perplexity 16.11 | 17 |
| Language Modeling | OpenWebText standard (test) | Perplexity 20.08 | 17 |