
Unsafe Prompt Detection on ToxicChat

75.5 AUPRC

GradSafe-Zero

[Chart: AUPRC by date; y-axis ticks 47.628, 54.864, 62.1, 69.336; x-axis label Feb 21, 2024]

Evaluation Results

Date      AUPRC   Method   Links
2024.02   75.5
2024.02   63.5
          60.4
          48.7