| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Language Modeling | GPT-2 Pre-training (val) | Validation Loss | 2.493 | 21 |
| Language Modeling | GPT-2 Evaluation Set | Hyper-Prior BPT | 29.91 | 20 |
| Language Modeling | GPT-2 124M held-out (test) | Perplexity | 17.33 | 10 |
| Machine-Generated Text Detection | GPT-2 (full) | Accuracy | 91.1 | 9 |
| Language Modeling | GPT-2 Pretraining Data (train) | Training Loss | 2.9167 | 8 |
| Data Extraction | GPT-2 (train) | Pearson r | 0.48 | 8 |
| Concentration of Target Information | GPT-2 Small Suite Aggregate (test) | Gini Coefficient | 0.71 | 6 |
| Concentration of Target Information | GPT-2 Small 5 (test) | Gini Coefficient | 0.72 | 6 |
| Concentration of Target Information | GPT-2 Small 4 (test) | Gini Coefficient | 0.74 | 6 |
| Concentration of Target Information | GPT-2 Small 3 (test) | Gini Coefficient | 0.73 | 6 |
| Concentration of Target Information | GPT-2 Small (test) | Gini Coefficient | 0.71 | 6 |
| Concentration of Target Information | GPT-2 Small 1 (test) | Gini Coefficient | 0.33 | 6 |
| Machine-Generated Text Detection | GPT-2 (test) | Accuracy | 85.75 | 5 |
| Language Modeling | GPT-2 1,000 samples | Perplexity (PPL) | 27.99 | 4 |
| Language Modeling | GPT-2 (val) | Base CE | 3.48 | 1 |
| Efficiency Evaluation | GPT-2 124M | Inference Speed (x) | 1 | 1 |
| Explanation Attribution | GPT-2 output-preserving n=30, k=2, overlap=0.7 (test) | - | - | 0 |
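The language-modeling rows mix three related metrics: cross-entropy loss, bits per token (BPT), and perplexity. Assuming the losses are mean per-token cross-entropies in nats (the usual convention for GPT-2 training logs), the conversions are fixed; a minimal sketch is below. Note the loss, BPT, and perplexity rows above come from different evaluation sets, so their values are not expected to convert into one another.

```python
import math

def ce_to_perplexity(ce_nats: float) -> float:
    """Perplexity is the exponential of mean per-token cross-entropy (in nats)."""
    return math.exp(ce_nats)

def ce_to_bits_per_token(ce_nats: float) -> float:
    """Convert nats/token to bits/token by dividing by ln(2)."""
    return ce_nats / math.log(2)

# Example: a validation loss of 2.493 nats corresponds to
# perplexity exp(2.493) ~ 12.10 and 2.493 / ln(2) ~ 3.60 bits/token.
print(ce_to_perplexity(2.493))      # ~12.10
print(ce_to_bits_per_token(2.493))  # ~3.60
```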
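The concentration rows report a Gini coefficient, which measures how unevenly a quantity is spread over a population (0 = perfectly even, 1 = fully concentrated in one item). The table does not specify which per-example scores the coefficient is computed over, so the input below is hypothetical; this is a sketch of the standard Gini computation on a vector of non-negative scores.

```python
import numpy as np

def gini(scores: np.ndarray) -> float:
    """Gini coefficient of a vector of non-negative scores.

    0 means the scores are spread evenly across examples;
    1 means all mass is concentrated in a single example.
    """
    x = np.sort(np.asarray(scores, dtype=float))
    n = x.size
    if n == 0 or x.sum() == 0:
        return 0.0
    # Sorted-index formulation: G = sum_i (2i - n - 1) * x_i / (n * sum(x)),
    # with i running 1..n over the ascending-sorted scores.
    index = np.arange(1, n + 1)
    return float(((2 * index - n - 1) * x).sum() / (n * x.sum()))

# Hypothetical usage: per-example scores where most mass sits in one example.
print(gini(np.array([0.1, 0.1, 0.1, 0.9])))  # 0.5: moderately concentrated
```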