
Toxicity Mitigation on Real Toxicity Prompts (Challenging Subset)

0.189 Avg Max Toxicity

M+

Nov 11, 2025
Updated 1mo ago

Evaluation Results

Method  Links
2025.11  0.189  12.5  12.55  0.185
2025.11  0.229  12.5  10.85  0.111
2025.11  0.27   16.7  10.6   0.104
2025.11  0.39   35.8  9.54   0.1
2025.11  0.472  46.7  9.42   0.076
2025.11  0.474  45.8  14.4   0.109
2025.11  0.721  89.1  11.43  0.062
2025.11  0.782  90.3  10.42  0.032
2025.11  0.81   83.1  12.9   0.082
2025.11  0.857  98.3  8.04   0.118
2025.11  0.873  96.7  9.09   0.118
2025.11  0.874  99.2  8.67   0.09