
RealToxicityPrompts

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Language model detoxification | RealToxicityPrompts (test) | Distinct-1: 91.3 | 54 |
| Toxicity Mitigation | RealToxicityPrompts challenging | Avg Toxicity (Max): 6.2 | 46 |
| Detoxification | RealToxicityPrompts challenging | Max Toxicity: 0.062 | 32 |
| Toxicity Evaluation | RealToxicityPrompts | Toxicity Score: 0 | 29 |
| Toxicity Mitigation | REALTOXICITYPROMPTS | Toxicity: 21.24 | 24 |
| Detoxification | RealToxicityPrompts | Avg Max Toxicity: 0.27 | 22 |
| Spoofing attack traceability | RealToxicityPrompts (test) | AUC: 90.11 | 20 |
| Toxicity evaluation | RealToxicityPrompts 1K non-toxic prompts, 1K toxic prompts | Count of Non-Toxic Samples: 5 | 14 |
| Toxicity Mitigation | RealToxicityPrompts (test) | Full Toxicity: 10.1 | 14 |
| Toxicity Generation | RealToxicityPrompts (test) | Perspective API Score: 9.2 | 12 |
| Toxicity Analysis | RealToxicityPrompts Nontoxic | Exp. Max. Toxicity: 0.22 | 10 |
| Controlled Text Generation | RealToxicityPrompts 10K nontoxic prompts | Avg Max Toxicity: 30.2 | 9 |
| Toxic Language Suppression | RealToxicityPrompts 10K nontoxic prompts GPT2-large generation (test) | Max Toxicity: 0.172 | 7 |
| Detoxification | REALTOXICITYPROMPTS (test) | Toxicity Score (Avg): 0.081 | 5 |
| Toxic Output Mitigation | RealToxicityPrompts 1.0 (Toxic) | Toxicity: 0.299 | 5 |
| Toxic Output Mitigation | RealToxicityPrompts 1.0 (Random) | Toxicity: 0.122 | 5 |
| Open-ended generation | RealToxicityPrompts Non-toxic prompts (test) | Toxicity Probability: 7.38 | 4 |
| Toxicity avoidance | RealToxicityPrompts | Avg Max Toxicity Score: 0.265 | 4 |
| Toxicity Generation | RealToxicityPrompts 100k prompts | Toxicity Score (Basic): 10.4 | 4 |
| Safety Evaluation | RealToxicityPrompts (test) | Safety Score: 96 | 3 |
| Toxicity Evaluation | RealToxicityPrompts responses | Classifier Score: 21 | 3 |
| Open-ended generation | RealToxicityPrompts Toxic prompts (test) | Toxicity Probability: 74.29 | 2 |
| Text-to-image generation | REALTOXICITYPROMPTS | Inappropriate Probability: 10 | 2 |