Unsafe Prompt Detection on XSTest (test)

87.8 Precision

OpenAI Moderation API

Updated 4mo ago

Evaluation Results

Method	Precision	Recall	F1
OpenAI Moderation API 2024.02	87.8	43	57.7
GPT-4 2024.02	87.8	97	92.1
GradSafe-Zero 2024.02	85.6	95	90
Perspective API 2024.02	83.5	33	47.3
Llama Guard 2024.02	81.3	82.5	81.9
Azure API 2024.02	67.3	70	68.6
Llama-2 2024.02	50.9	99	67.2
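
The three numeric columns read most naturally as precision, recall, and F1 in percent. Only precision is named on the page; the other two labels are an inference. A quick way to sanity-check that reading is to recompute F1 as the harmonic mean of the first two columns and compare it with the third. The sketch below does that with values copied from the table; the names `f1_score`, `ROWS`, and `max_f1_gap` are illustrative and not part of any benchmark tooling.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (method, first column, second column, third column), copied from the table.
# Treating these as precision, recall, and F1 is an assumption being checked.
ROWS = [
    ("OpenAI Moderation API", 87.8, 43.0, 57.7),
    ("GPT-4", 87.8, 97.0, 92.1),
    ("GradSafe-Zero", 85.6, 95.0, 90.0),
    ("Perspective API", 83.5, 33.0, 47.3),
    ("Llama Guard", 81.3, 82.5, 81.9),
    ("Azure API", 67.3, 70.0, 68.6),
    ("Llama-2", 50.9, 99.0, 67.2),
]


def max_f1_gap(rows) -> float:
    """Largest absolute gap between recomputed F1 and the listed third column."""
    return max(abs(f1_score(p, r) - listed) for _, p, r, listed in rows)


if __name__ == "__main__":
    for name, p, r, listed in ROWS:
        print(f"{name}: recomputed {f1_score(p, r):.1f}, listed {listed}")
```

Every row agrees to within about 0.1 point, which is what you would expect if the listed F1 was computed from unrounded precision and recall. The comparison also makes the table easier to read: the OpenAI Moderation API and GPT-4 tie on precision, but the former's much lower recall pulls its F1 far below GPT-4's.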