| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Membership Inference Attack | Pile CC | TPR @ 1% | 1.7 | 61 |
| Membership Inference Attack | Pile CC Pythia | ROC AUC | 71 | 36 |
| Conditional Generation | Pile | Perplexity | 11.2 | 18 |
| Language Modeling | MiniPile (val) | Validation Perplexity | 40.83 | 10 |
| Language Modeling | MiniPile (train) | Training Perplexity | 30.87 | 10 |
| Clutter removal | Pile single-view, random camera pose, Gaussian noise | GSR | 92 | 10 |
| Language Modeling | Pile uncopyrighted (test) | Worst Log-Perplexity | 3.608 | 9 |
| Language Modeling | Pile | Loss | 1.876 | 8 |
| Clutter removal | Pile Real-world | GSR (%) | 79.3 | 7 |
| Membership Inference | PILE | Loss (AUROC) | 50.9 | 7 |
| Membership Inference Attack | PILE (train) | Loss | 8.2 | 7 |
| Language Modeling | Pile (val) | Loss | 1.808 | 5 |
| Language Modeling | Pile | BPB | 0.74 | 4 |
| Language | Pile (test) | Accuracy | 59.4 | 3 |
| Text reconstruction | pile | PPL | 1.65 | 3 |
| Language Modeling | Pile Non-AR tokens | Perplexity | 33.95 | 3 |
| Language Modeling | Pile AR tokens | Perplexity | 3.07 | 3 |
| Language Modeling | pile (val) | Perplexity | 12.978 | 2 |