Detoxification on REALTOXICITYPROMPTS (test)
[Chart: Toxicity Score (Avg). Top entry: FINE-GRAINED RLHF, 0.081, Jun 2, 2023. Selectable metrics: Toxicity Score (Avg), PPL, Dist-2, Dist-3.]
Updated 4d ago
Evaluation Results
Method             Details                    Date     Toxicity Score (Avg)  PPL    Dist-2  Dist-3
FINE-GRAINED RLHF  Reward Type=Sentence-l...  2023.06  0.081                 9.77   94.9    93.2
Holistic RLHF      Reward Type=Holistic       2023.06  0.13                  11.75  94.3    92.6
DEXPERTS           Approach=Controlled Ge...  2023.06  0.136                 22.83  93.2    92.2
GeDi               Approach=Controlled Ge...  2023.06  0.154                 24.78  93.8    93.8
GPT-2              Model Size=Large           2023.06  0.192                 9.58   94.7    93.1
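The Dist-2 and Dist-3 columns report diversity over n-grams. This page does not state how the benchmark normalizes them, so the sketch below assumes one common definition: the number of distinct n-grams divided by the total number of n-grams in a generation. The function name distinct_n and the sample sentence are hypothetical, chosen only for illustration.

```python
def distinct_n(tokens, n):
    """Fraction of n-grams in `tokens` that are unique.

    Assumed definition (not confirmed by the benchmark page):
    distinct n-grams / total n-grams. Returns 0.0 when the
    sequence is too short to contain any n-gram.
    """
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)


# Hypothetical generation, split on whitespace for simplicity.
tokens = "the cat sat on the mat and the cat slept".split()

dist2 = distinct_n(tokens, 2)  # 8 distinct bigrams out of 9
dist3 = distinct_n(tokens, 3)  # all 8 trigrams are distinct

print(f"Dist-2: {dist2:.3f}")
print(f"Dist-3: {dist3:.3f}")
```

Under this definition a higher value means less repetition. The table values appear to be on a 0 to 100 scale, so multiplying the ratio by 100 would put the two on the same footing, if the assumed definition matches the benchmark's.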