| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Language Modeling | OpenWebText2 (test) | Perplexity | 16.2 | 104 |
| Language Modeling | OpenWebText (val) | Validation Loss | 2.6091 | 70 |
| Text Generation | OpenWebText | Perplexity | 132.55 | 66 |
| Unconditional Text Generation | OpenWebText | Gen. PPL | 11.1 | 56 |
| Language Modeling | OpenWebText | Perplexity | 11 | 50 |
| Unconditional generation | OpenWebText (OWT) L=1024 (held-out) | MAUVE | 1 | 45 |
| Sentiment Steering | OpenWebText Neutral to Negative (test) | Perplexity (PPL) | 12.48 | 27 |
| Sentiment Steering | OpenWebText Neutral to Positive (test) | Perplexity (PPL) | 12.48 | 27 |
| Unconditional Text Generation | OpenWebText (test) | LLAMA2 Score | 692.3 | 21 |
| Embedding Space Analysis | OpenWebText | Iso | 0.98 | 18 |
| Language Modeling | OpenWebText (test) | Loss | 2.65 | 18 |
| Language Modeling | OpenWebText standard (test) | Perplexity | 20.08 | 17 |
| Language Modeling | OpenWebText (held-out set) | PPL | 11.5 | 16 |
| Language Modeling | OpenWebText GPT-2 (test) | Perplexity | 17.94 | 13 |
| Language Modeling | OpenWebText (OWT) (val) | Perplexity | 17.5 | 12 |
| Unconditional generation | OpenWebText L=2048 (test) | Gen. PPL | 13.2 | 12 |
| Unconditional generation | OpenWebText L=1024 (test) | Generation Perplexity | 14.1 | 12 |
| Language Modeling | OpenWebText2 (val) | Perplexity | 17.12 | 12 |
| Text Generation | OpenWebText (OWT) GPT-2 tokenizer (val) | PPL | 15.36 | 12 |
| Language Modeling | OpenWebText (train) | Train Loss | 2.5243 | 11 |
| Language Modeling | OpenWebText GPT-2 124M (val) | LCE | 3.167 | 8 |
| Text Generation | OpenWebText (test) | Average Perplexity | 3.77 | 8 |
| Language generation | OpenWebText (val) | OLMo Perplexity | 14.2 | 8 |
| Sentiment Steering | OpenWebText Positive prompts (test) | Negativity Score | 0.6 | 8 |
| Sentiment Steering | OpenWebText Negative prompts (test) | Positivity Score | 0.59 | 8 |