| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Language Modeling | LM1B (test) | Perplexity | 20.86 | 120 |
| Text Generation | LM1B (test) | Entropy | 2.46 | 72 |
| Hyperparameter Optimization | PD1-LM1B (val) | Validation Error | 0.628 | 24 |
| Language Modeling | LM1B | PPL (Generalized) | 90.9 | 20 |
| Language Modeling | LM1B L=128 (test) | NELBO PPL | 24.53 | 17 |
| Language Modeling | LM1B (val) | Perplexity | 22.32 | 17 |
| Language Modeling | LM1B zero-shot | Perplexity | 51.25 | 10 |
| Language Modeling | LM1B | Perplexity | 29.61 | 7 |
| Autoregressive Language Modeling | LM1B | PPL | 21.5 | 7 |
| Language Modeling | LM1B (test) | Block Efficiency | 4 | 5 |
| Language Modeling | LM1B GPT2 | PPL | 65.629 | 4 |
| Language Modeling | LM1B ctx len. 128 (val) | PPL (val) | 25.72 | 3 |
| Text Generation | LM1B (val) | Perplexity | 51.25 | 1 |
| Autoregressive Language Modeling | LM1B 1.0 (test) | - | - | 0 |
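Most rows above report perplexity, which is just the exponentiated mean per-token negative log-likelihood. As a minimal sketch (the function names here are illustrative, not from any benchmark's codebase), the conversion from a reported loss to a perplexity like those in the table is:

```python
import math

def perplexity(nll_nats: float) -> float:
    """Perplexity from mean per-token negative log-likelihood in nats:
    PPL = exp(NLL)."""
    return math.exp(nll_nats)

def perplexity_from_bits(nll_bits: float) -> float:
    """Same quantity when the loss is reported in bits per token:
    PPL = 2 ** NLL_bits."""
    return 2.0 ** nll_bits

# A mean NLL of about 3.038 nats corresponds to the table's top entry:
print(round(perplexity(3.038), 2))        # → 20.86
print(perplexity_from_bits(1.0))          # → 2.0
```

Note that entropy-style numbers (e.g. the Text Generation row) are on this log scale, while perplexity rows are already exponentiated, so the two are not directly comparable.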