| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Language Modeling | WikiText-103 (test) | Perplexity 2.22 | 703 |
| Language Modeling | WikiText-103 (val) | PPL 1.01 | 261 |
| Language Modeling | WikiText-103 | PPL 4.59 | 216 |
| Word-level Language Modeling | WikiText-103 word-level (test) | Perplexity 15.79 | 65 |
| Word-level language modeling | WikiText-103 (dev) | Perplexity 15.72 | 64 |
| Language Modeling | WikiText-103 v1 (test) | Perplexity 10.48 | 56 |
| Text Continuation | WikiText-103 512-token continuation (test) | Perplexity (PPL) 1 | 47 |
| Language Modeling | WikiText-103 | Perplexity (PPL) 5.47 | 43 |
| Language Modeling | Wikitext-103 | PPL 3.14 | 42 |
| Language Generation | WikiText-103 | Perplexity (PPL) 1 | 41 |
| Language Modeling | WikiText-103 zero-shot (test) | PPL 12.76 | 34 |
| Language Modeling | Wikitext-103 | Perplexity (PPL) 14.8 | 28 |
| Open-ended generation | Wikitext-103 (test) | MAUVE 0.96 | 26 |
| Tokenization | WikiText-103 | Latency (ms) 1.92 | 25 |
| Language Modeling | WikiText-103 | Delta PPL 0 | 25 |
| Text generation | Wikitext-103 | Perplexity 32.88 | 23 |
| Steganographic secret extraction | WikiText-103 W (test) | Accuracy 75 | 20 |
| Language Modeling | WikiText-103 small setting (test) | Perplexity 32.8 | 20 |
| Language Modeling | WikiText-103 small setting (val) | Perplexity 31.8 | 20 |
| Language Modeling | WikiText-103 (train) | PPL 15.18 | 19 |
| Language Modeling | WikiText-103 | Mauve 87 | 18 |
| Language Modeling | WikiText-103 | Base Score 173.704 | 18 |
| Language Modeling | WikiText-103 | Perplexity 15.33 | 17 |
| Language Modeling | WikiText-103 | Perplexity (PPL) 72 | 15 |
| Text Generation | WikiText-103 | ROUGE-1 0.391 | 15 |