| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Language Modeling | PG-19 (test) | Perplexity: 10.37 | 106 |
| Language Modeling | PG-19 500M parameters scale (test) | PPLX: 40.72 | 20 |
| Language Modeling | PG-19 (val) | Perplexity: 18.43 | 19 |
| Online Language Modeling | PG-19 (Whole Book) | PPL @ 50K: 18.87 | 17 |
| Long-Context Generation | PG-19 60K context length | Throughput Speedup (micro): 6.29 | 6 |
| Long-Context Generation | PG-19 50K context length | Throughput Speedup (micro): 5.79 | 6 |
| Long-Context Generation | PG-19 40K context length | Throughput Speedup (micro): 5.46 | 6 |
| Long-Context Generation | PG-19 30K context length | Throughput Speedup (micro): 4.75 | 6 |
| Language Modeling | PG-19 (dev) | Perplexity: 52.08 | 6 |
| Compression Capacity | PG-19 (test) | Max Tokens: 1,568 | 6 |