
RealToxicityPrompts

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Profanity suppression | RealToxicityPrompts | Relative Throughput 103.6 | 126 |
| Language model detoxification | RealToxicityPrompts (test) | Distinct-1 91.3 | 54 |
| Toxicity Mitigation | RealToxicityPrompts challenging | Avg Toxicity (Max) 6.2 | 46 |
| Detoxification | RealToxicityPrompts challenging | Max Toxicity 0.062 | 32 |
| Toxicity Evaluation | RealToxicityPrompts | Toxicity Score 0 | 29 |
| Toxicity Mitigation | REALTOXICITYPROMPTS | Toxicity 21.24 | 24 |
| Detoxification | RealToxicityPrompts | Avg Max Toxicity 0.27 | 22 |
| Toxicity Mitigation | RealToxicityPrompts 1k samples | CLS Toxicity 0.51 | 20 |
| Spoofing attack traceability | RealToxicityPrompts (test) | AUC 90.11 | 20 |
| Toxicity evaluation | RealToxicityPrompts 1K non-toxic prompts, 1K toxic prompts | Count of Non-Toxic Samples 5 | 14 |
| Toxicity Mitigation | RealToxicityPrompts (test) | Full Toxicity 10.1 | 14 |
| Multi-modal Toxicity Attack | RealToxicityPrompts (RTP) (test) | Overall Score 31.36 | 12 |
| Toxicity Mitigation | RealToxicityPrompts (RTP) | CLS Tox Rate 0.53 | 12 |
| Toxicity Generation | RealToxicityPrompts (test) | Perspective API Score 9.2 | 12 |
| Toxicity Analysis | RealToxicityPrompts Nontoxic | Exp. Max. Toxicity 0.22 | 10 |
| Controlled Text Generation | RealToxicityPrompts 10K nontoxic prompts | Avg Max Toxicity 30.2 | 9 |
| Non-toxic generation | RealToxicityPrompts | Avg. Max Toxicity 0.115 | 8 |
| Toxic Text Generation | RealToxicityPrompts malicious | Attack Success Rate (ASR) 14.8 | 8 |
| Toxicity Auditing | RealToxicityPrompts (hold-out) | Detoxify Identity Attack Score 4.97 | 7 |
| Multimodal Safety Auditing | RealToxicityPrompts primary evaluation | Detoxify Identity Attack Score 3.05 | 7 |
| Toxic Language Suppression | RealToxicityPrompts 10K nontoxic prompts GPT2-large generation (test) | Max Toxicity 0.172 | 7 |
| Toxicity Evaluation | RealToxicityPrompts RTP-N (Nontoxic) | Toxic Fraction 0.2 | 5 |
| Toxicity Evaluation | RealToxicityPrompts RTP-C | Toxic Fraction 18.1 | 5 |
| Counterfactual Fairness | RealToxicityPrompts RTP-N | Sentiment Parity 0.006 | 5 |
| Counterfactual Fairness | RealToxicityPrompts RTP-C | Sentiment Parity 0.2 | 5 |

Showing 25 of 39 rows