| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Language Modeling | WikiText-2 (test) | PPL 2.56 | 1,541 |
| Language Modeling | WikiText | PPL 2.92 | 479 |
| Language Modeling | WikiText (test) | Perplexity 5.49 | 52 |
| Language Modeling | WikiText (val) | Perplexity 21.14 | 34 |
| Language Modeling | WikiText (held-out) | Perplexity (PPL) 9.8 | 25 |
| Language Modeling | WikiText v1 (test) | Perplexity 13.33 | 18 |
| Privacy Measurement | WikiText | Epsilon 0 | 12 |
| Open-ended Text Generation | Wikitext (test) | Diversity (DIV) 95 | 12 |
| Language Modeling | WikiText | Perplexity (Baseline) 9.91 | 11 |
| Prefilling Profiling | WikiText (test) | Time (s) 38 | 10 |
| Language Modeling | Wikitext zero-shot | Perplexity 25.75 | 10 |
| Language Modeling | WikiText (test) | ROUGE Score 64.14 | 8 |
| Language Modeling | WikiText 1K | Perplexity 13.8 | 7 |
| Membership Inference Attack | WikiText | TPR @ 0.1% FPR 14 | 6 |
| Knowledge Evaluation | WikiText (eval) | BPB 0.777 | 6 |
| Masked Reconstruction | WikiText-103 | PPL 4.94 | 5 |
| Text Generation | Wikitext | Coherence: CD Better Rate 88.7 | 4 |
| Language Modeling | Wikitext | Accuracy 28.75 | 3 |
| Language Modeling | WikiText 50 (test) | Normalized Energy 0.96 | 3 |
| Membership Inference Attack | WikiText | AUC (LOSS) 0.725 | 3 |
| Open generation | WikiText-103 | Diversity 75 | 3 |
| Language Modeling | WikiText | Accuracy 17.2826 | 2 |
| Handwriting Text Recognition | Wikitext 2 column synthetic (test) | CER 0.012 | 2 |
| Handwriting Text Recognition | Wikitext 1 column synthetic (test) | CER 0.008 | 2 |
| Text Generation | WikiText (val) | Perplexity (PPL) 25.75 | 1 |