| Dataset Name | SOTA Method | Metric | Trend | | |
|---|---|---|---|---|---|
| WikiText2 | | Perplexity 2.86 | 1,875 | 2d ago | |
| WikiText-2 (test) | | PPL 2.56 | 1,541 | 2d ago | |
| C4 | Llama2-7B | Perplexity 4.77 | 1,182 | 3d ago | |
| WikiText-2 | TCN-SEQ-I | Perplexity (PPL) 1.61 | 841 | 2d ago | |
| PTB | | Perplexity 8.159 | 650 | 3d ago | |
| WikiText-103 (test) | RETRO | Perplexity 2.22 | 524 | 2d ago | |
| WikiText | | PPL 2.92 | 479 | 2d ago | |
| PTB (test) | | Perplexity 8.159 | 471 | 3d ago | |
| Penn Treebank (test) | GL-LWGC-AWD-MOS-LSTM + dynamic evaluation | Perplexity 46.34 | 411 | 3d ago | |
| C4 (val) | | PPL 5.709 | 392 | 3d ago | |
| WikiText2 v1 (test) | | Perplexity 1.7 | 341 | 3d ago | |
| C4 | Wanda | Perplexity 1 | 321 | 2d ago | |
| WikiText2 (val) | TCN-SEQ-J | Perplexity (PPL) 3.03 | 277 | 2d ago | |
| C4 (test) | | Perplexity 4.97 | 268 | 3d ago | |
| Wiki | Wanda | Perplexity (PPL) 2 | 251 | 2d ago | |
| LAMBADA | PaLM-2 L | Accuracy 86.9 | 183 | 2d ago | |
| WikiText-103 (val) | | PPL 1.01 | 180 | 2d ago | |
| Penn Treebank (val) | GL-LWGC-AWD-MOS-LSTM + dynamic evaluation | Perplexity 46.64 | 178 | 3d ago | |
| FineWeb (val) | UMTAM | Validation Loss 2.03 | 156 | 3d ago | |
| WikiText-103 | ESPACE | PPL 4.59 | 146 | 3d ago | |
| ARXIV (test) | BRECT:FIXED:SKIP | PPL 2.36 | 137 | 3d ago | |
| Penn Treebank (PTB) (test) | | Perplexity 14.72 | 120 | 2d ago | |
| GitHub (test) | KERPLE-log | Perplexity 2.42 | 113 | 3d ago | |
| One Billion Word Benchmark (test) | H-Transformer-1D | Test Perplexity 20.25 | 108 | 3d ago | |
| PG-19 (test) | BST:SH:UNSTRUCT-L | Perplexity 10.37 | 106 | 3d ago | |