Unsafe Prompt Detection on ToxicChat
Top result: GradSafe-Zero, 75.5 AUPRC (Feb 21, 2024)
Evaluation Results
Method                 Details                    Date     AUPRC
GradSafe-Zero          Backbone=Llama-2-7b-ch...  2024.02  75.5
Llama Guard            Backbone=Llama-2 7b        2024.02  63.5
OpenAI Moderation API  -                          2024.02  60.4
Perspective API        -                          2024.02  48.7