SST2

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Image Classification | SST2 Rendered | Top-1 Accuracy 68.4 | 47 |
| Sentiment Analysis | SST2 | Accuracy 94.77 | 39 |
| Counterfactual Generation | SST2 (test) | SLFR 29 | 29 |
| Sentiment Analysis | SST2 | Spearman Rho (x100) 93.32 | 23 |
| Text embedding | SST2 | T-Value 24.69 | 20 |
| Sentiment Classification | SST2 | Deletion Robustness 0.2943 | 20 |
| Sentiment Analysis | SST2 | Accuracy 94.04 | 20 |
| Feature Attribution | SST2 | LO -0.199 | 18 |
| Out-of-Distribution Detection | SST2 (test) | AUROC 0.7327 | 17 |
| Sentiment Classification | SST2 phrase | Accuracy 93.96 | 16 |
| Text Classification | SST2 | Macro-F1 95.07 | 15 |
| Graph Classification | SST2 length OOD | Accuracy 81.9 | 14 |
| Sentiment Analysis | SST2 (test) | HS Score 54.4 | 14 |
| Graph Classification | SST2 GraphOOD (test) | Accuracy 83.52 | 13 |
| Text Classification | SST2 | Accuracy 93.42 | 10 |
| Graph Classification | GRAPH-SST2 (test) | Accuracy 82.99 | 8 |
| Sentiment Classification | SST2 | Accuracy 95.87 | 6 |
| Classification | SST2 Dir alpha=0.1 | Generalized Accuracy 92.14 | 6 |
| Sentiment Analysis | SST2 Dir alpha=0.1 Standard | Personalized Accuracy (Acc_p) 95.9 | 6 |
| Alignment defense against harmful fine-tuning | SST2 | Harmful Score (HS) 11.3 | 5 |
| Sentiment Classification | SST2 | HS Metric 33.86 | 5 |
| Binary Classification | SST2 32-shot (test) | Accuracy 75.5 | 5 |
| Binary Classification | SST2 16-shot (test) | Accuracy 73.2 | 5 |
| Binary Classification | SST2 4-shot (test) | Accuracy 0.698 | 5 |
| Watermarking | SST2 (test) | ACC 93.07 | 4 |
Showing 25 of 29 rows