| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Classification Probing | Sarcasm (test) | Probe Acc (Best Layer) | 96.3 | 21 |
| Concept Attribution | Sarcasm | Avg Attribution F1 | 42 | 18 |
| Sarcasm Detection | Sarcasm dataset (test) | Accuracy | 88.1 | 8 |
| Abuse Detection | Sarcasm Dataset | AUC | 98 | 8 |
| Concept Detection | Sarcasm (test) | F1 Score | 87 | 6 |
| Concept vector stability | Sarcasm | Mean Absolute-Cosine Similarity | 0.99 | 6 |
| Concept Detection | Sarcasm | F1 Score | 66.2 | 5 |
| Sarcasm Detection | Sarcasm | F1 Macro | 53.57 | 3 |