Toxicity

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Steering | Toxicity | Steering Success: 64 | 11
Case Deletion Diagnostics | Toxicity binary subsample (test) | AUC-DEL: 2.08 | 10
Adversarial Robustness | Toxicity Perturbation-based | Perplexity: 9.52 | 9
Text Classification | Toxicity Nooverlap BERT-small | AUC-DEL Plus: 0.003 | 7
Text Classification | Toxicity BERT-small targeted Kaggle 2018 (test) | AUC-DEL+: 0.016 | 7
Label aggregation assessment | Toxicity (test) | Test Accuracy: 79 | 4
Toxicity Classification | Toxicity | Original Accuracy: 90.4 | 4