| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Sentiment Classification | SST2 (test) | Accuracy | 96 | 214 |
| Sentiment Classification | SST-2 | Accuracy | 95.99 | 174 |
| Sentiment Analysis | SST-5 (test) | Accuracy | 62.27 | 173 |
| Sentiment Analysis | SST-2 | Accuracy | 97.48 | 156 |
| Text Classification | SST-2 | Accuracy | 97.09 | 129 |
| Text Classification | SST-2 | Accuracy | 96 | 121 |
| Classification | SST2 | Accuracy | 96.3 | 58 |
| Text Classification | SST-5 (test) | Accuracy | 56.21 | 58 |
| Sentiment Analysis | SST-5 | Accuracy | 94.84 | 47 |
| Text Classification | SST-1 | Accuracy | 52.4 | 45 |
| Sentiment Classification | SST (test) | Accuracy | 93.8 | 37 |
| Text Reconstruction Attack | SST-2 | Total Runtime (hours) | 0.1 | 36 |
| Text Classification | SST2 | Accuracy | 97.36 | 35 |
| Text Classification | SST-2 | Harmful Score | 55.7 | 35 |
| Sentiment Analysis | SST-2 | Accuracy | 96.71 | 33 |
| Training Data Reconstruction | SST | ROUGE-1 | 1 | 32 |
| Sentiment Classification | SST-5 | Accuracy | 70.67 | 31 |
| Sentiment Analysis | SST-2 | ACC | 96 | 30 |
| Text Classification | SST (val) | Top-1 Acc | 68.76 | 30 |
| Text Classification | SST binary | Accuracy | 91.7 | 29 |
| Faithfulness evaluation | SST2 | AUC π-Soft (NS) | 0.563 | 27 |
| Text Clustering | SST-5 | Accuracy | 52.6 | 25 |
| Text Clustering | SST-2 | Accuracy | 91.1 | 25 |
| Explanation Faithfulness | SST-2 | Delta AF | -0.675 | 24 |
| Text Classification | SST-5 | Accuracy | 62.31 | 24 |