SST2

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Sentiment Analysis | SST2 | Accuracy 94.9541 | 47 |
| Image Classification | SST2 Rendered | Top-1 Accuracy 68.4 | 47 |
| Counterfactual Generation | SST2 (test) | SLFR 29 | 29 |
| Sentiment Analysis | SST2 | Spearman Rho (x100) 93.32 | 23 |
| Text embedding | SST2 | T-Value 24.69 | 20 |
| Sentiment Classification | SST2 | Deletion Robustness 0.2943 | 20 |
| Sentiment Analysis | SST2 | Accuracy 94.04 | 20 |
| Feature Attribution | SST2 | LO -0.199 | 18 |
| Out-of-Distribution Detection | SST2 (test) | AUROC 0.7327 | 17 |
| Sentiment Classification | SST2 phrase | Accuracy 93.96 | 16 |
| Text Classification | SST2 | Macro-F1 95.07 | 15 |
| Graph Classification | SST2 length OOD | Accuracy 81.9 | 14 |
| Sentiment Analysis | SST2 (test) | HS Score 54.4 | 14 |
| Graph Classification | SST2 GraphOOD (test) | Accuracy 83.52 | 13 |
| Text Classification | SST2 | Accuracy 93.42 | 10 |
| Graph Classification | GRAPH-SST2 (test) | Accuracy 82.99 | 8 |
| Sentiment Analysis | SST2 (test) | Accuracy 90.64 | 7 |
| Reasoning | SST2 | Accuracy (FA) 95.9 | 7 |
| Sentiment Analysis | SST2 | CACC 96.2 | 6 |
| Sentiment Classification | SST2 | Accuracy 95.87 | 6 |
| Classification | SST2 Dir alpha=0.1 | Generalized Accuracy 92.14 | 6 |
| Sentiment Analysis | SST2 Dir alpha=0.1 Standard | Personalized Accuracy (Acc_p) 95.9 | 6 |
| Alignment defense against harmful fine-tuning | SST2 | Harmful Score (HS) 11.3 | 5 |
| Sentiment Classification | SST2 | HS Metric 33.86 | 5 |
| Binary Classification | SST2 32-shot (test) | Accuracy 75.5 | 5 |
Showing 25 of 34 rows