
ToxiGen

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Safety Evaluation | ToxiGen | Safety 100 | 77 |
| Toxicity Detection | ToxiGen | Score 84.23 | 53 |
| Toxicity Generation | ToxiGen | ToxiGen Score 1,633 | 24 |
| Toxicity Classification | Toxigen | Accuracy 60.41 | 22 |
| Harmlessness | Toxigen | Toxigen (%) 100 | 17 |
| Detoxification | ToxiGen (test) | MTV 97.4 | 16 |
| Influence Estimation | ToxiGen (test) | Spearman Correlation 0.44 | 14 |
| Machine Unlearning | ToxiGen (test) | Accuracy ($D_f$) 86.9 | 13 |
| Machine Unlearning | ToxiGen (train) | Accuracy ($D_f$) 85.06 | 13 |
| Bias Detection | Toxigen (test) | Accuracy 90.3 | 12 |
| Safety Evaluation | ToxiGen Pretrained Evaluation | Toxicity Rate 14.53 | 12 |
| Toxicity Detection | TOXIGEN (val) | AUC 96 | 8 |
| Misuse Detection | ToxiGen Homophobia (external) | TPR 98 | 1 |
| Misuse Detection | ToxiGen Ethnoracial (external) | TPR 91 | 1 |
| Detoxification Dataset Quality Evaluation | ToxiGen 500 neutral-toxic pairs | Overall O.2.475 | 1 |