
SST-2

Benchmarks

Task Name                   Dataset Name               SOTA Result                Trend
Text Classification         SST-2 (test)               Accuracy 98                185
Sentiment Analysis          SST-2 (test)               Accuracy 97.1              136
Sentiment Classification    SST-2 64 instances (test)  Accuracy 92.55             80
Backdoor Defense            SST-2                      CACC 91.71                 65
Interpretation              SST-2                      L2 Norm 0.0434             56
Sentiment Analysis          SST-2 (test)               Clean Accuracy 96.43       50
Sentiment Analysis          SST-2 GLUE                 F1 Score 94.9              45
Sentiment Analysis          SST-2 (dev)                Accuracy 96.8              41
Sentiment Analysis          SST-2                      Accuracy 96.9              31
Text Classification         SST-2                      Accuracy 93.62             24
Faithfulness Evaluation     SST-2 (test)               Rate of Label Changes 5.5  24
Sentiment Classification    SST-2                      Delta Accuracy 0.05        24
Sentiment Analysis          SST-2 (test)               Avg Accuracy 86.7          24
Sentiment Analysis          SST-2                      CACC 96.7                  20
Text Classification         SST-2                      CA 95.06                   20
Text Classification         SST-2 (test)               Delta CACC 1.57            18
Backdoor Trigger Detection  SST-2                      AU-ROC 98.77               16
Sentiment Analysis          SST-2 (test)               CACC (Badnet) 95.55        15
Sentiment Analysis          SST-2 (held-out)           F1 Score 39.8              14
Text Clustering             SST-2 (test)               Accuracy 90.2              14
Explanation Evaluation      SST-2 (test)               Sufficiency 17.69          14
Sentiment Analysis          SST-2 (test)               Top-1 Accuracy 92.66       12
Backdoor Purification       SST-2                      CACC 89.84                 12
Sentiment Analysis          SST-2 (test)               Attack Success Rate 100    12
Sentiment Analysis          SST-2 original (test)      Accuracy 95.9              11
Showing 25 of 65 rows