| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Text Reconstruction Attack | CoLA | Total Runtime (hours) | 0.1 | 36 |
| Training Data Reconstruction | CoLA | ROUGE-1 | 100 | 32 |
| Linguistic Acceptability | COLA | Accuracy | 83.43 | 31 |
| Text reconstruction from gradients | CoLA | ROUGE-1 | 100 | 24 |
| Acceptability Classification | CoLA out-of-domain (dev test) | Accuracy | 84.9 | 24 |
| Grammatical Acceptability | CoLA (held-out) | F1 Score | 46.8 | 14 |
| Text-to-Image retrieval | Cola (test) | Multi-Obj Acc | 83.88 | 14 |
| Dependency tree recovery | CoLA max length 14 (val) | ARR | 75.7 | 14 |
| Acceptability Classification | CoLA English (IDD) | Accuracy | 88.2 | 12 |
| Acceptability Classification | CoLA in-domain (dev) | Accuracy | 88.6 | 12 |
| Natural Language Inference | CoLA | Accuracy | 86.26 | 10 |
| Compositional Reasoning | Cola | Txt2Img Score | 33.33 | 10 |
| Linguistic Acceptability | CoLA GLUE (val) | Accuracy | 84.76 | 9 |
| Sentence Classification | CoLA full (test) | Accuracy | 86.4 | 9 |
| Linguistic Acceptability | CoLA (dev) | Matthews Correlation | 68.5 | 8 |
| Syntax | CoLA | MCC | 59.71 | 8 |
| Text Classification | CoLA (test) | MCC | 32.05 | 8 |
| Classification | COLA | ASR Score | 0.175 | 8 |
| Text-to-image retrieval | Cola | R@1 (Txt2Img) | 22.7 | 8 |
| Linguistic Acceptability | CoLA (test) | Avg Accuracy | 86.34 | 8 |
| Linguistic Acceptability | CoLA (OOD) | MCC | 23.06 | 6 |
| Linguistic Acceptability | CoLA | Max Memory (MB) | 3,080 | 5 |
| Compositional Evaluation | CoLA Txt2Img | Compositional Score (Txt2Img) | 21 | 4 |
| Linguistic Acceptability Classification | CoLA out-of-domain 2018 (test) | MCC | 0.616 | 4 |
| Ranking correlation with full dataset evaluation | COLA | Kendall Correlation | 0.63 | 4 |