| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Membership Inference Attack | Wikipedia | AUC 0.9 | 52 |
| Inductive Dynamic Link Prediction | Wikipedia (inductive) | AUC-ROC 0.9848 | 44 |
| Dynamic Link Prediction | Wikipedia (inductive) | AP 98.59 | 44 |
| Dynamic Graph Anomaly Detection | Wikipedia S2 | AUROC 83.39 | 42 |
| Response Correctness and Completeness Evaluation | Wikipedia | F1 Score 68 | 38 |
| Transductive Dynamic Link Prediction | Wikipedia | AUC-ROC 99.31 | 37 |
| Membership Inference Attack | Wikipedia Pythia | ROC AUC 74 | 36 |
| Membership Inference | Wikipedia Pythia (train) | TPR@1%FPR 22.7 | 36 |
| Reliability of Post-Edit LLMs | Wikipedia | BLEU 100 | 36 |
| Language Modeling | Wikipedia | Perplexity 9.17 | 35 |
| Dynamic Link Prediction | Wikipedia | AP 99.03 | 27 |
| Membership Inference Attack | Wikipedia (en) | AUC 0.79 | 26 |
| Document Classification | Wikipedia (test) | Classification Error 30.24 | 24 |
| Dynamic Link Prediction | Wikipedia | AUC-ROC 0.8768 | 22 |
| Link Prediction | Wikipedia (inductive) | AP 99.04 | 21 |
| Link Prediction | Wikipedia (transductive) | AP 99.31 | 21 |
| Fact Memorization | Wikipedia corpus annotated (train) | Fact Accuracy 929.25 | 20 |
| Language Modeling | Wikipedia 20k sentences | Perplexity (Wikipedia 20k) 9.06 | 20 |
| Unconditional Text Generation | Wikipedia | MAUVE Score 90.1 | 18 |
| Graph Clustering | Wikipedia | NMI 0.516 | 15 |
| Machine-Paraphrased Plagiarism Detection | Wikipedia SpinBot paraphrased (test) | F1-Micro 89.55 | 15 |
| Node Classification | Wikipedia | AUC 88.32 | 15 |
| AI-Generated Text Detection | Wikipedia OPT-13B generations (+ 60L,600) | Accuracy (1% FPR) 97.2 | 14 |
| Page Classification | Wikipedia (90% train ratio) | Macro-F1 83.66 | 13 |
| Link Prediction | Wikipedia | AUC 99.2 | 12 |
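Most rows above report ranking metrics such as AUC-ROC and AP (average precision). As a reference for how those two scores are computed, here is a minimal pure-Python sketch; the `labels`/`scores` toy data are illustrative only and do not correspond to any row in the table:

```python
def auc_roc(labels, scores):
    """AUC-ROC via the Mann-Whitney U formulation: the probability that a
    randomly chosen positive is scored higher than a randomly chosen negative
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """AP: rank items by score, then average the precision at each rank
    where a positive item is retrieved."""
    ranked = sorted(zip(scores, labels), reverse=True)
    hits, ap = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            ap += hits / rank
    return ap / sum(labels)

labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(auc_roc(labels, scores))            # 7/9: one negative outranks two positives
print(average_precision(labels, scores))  # 29/36
```

Note that AUC-ROC and AP can diverge on the same predictions (as in the two Wikipedia link-prediction rows reporting both): AP weights errors near the top of the ranking more heavily, while AUC-ROC treats all positive/negative pairs equally.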