
Unsafe Prompt Detection on XSTest (test)

87.8 Precision

OpenAI Moderation API

[Chart: Precision by date; Feb 21, 2024]

Evaluation Results

Method  Precision  Recall  F1    Date     Links
        87.8       43      57.7  2024.02
        87.8       97      92.1  2024.02
        85.6       95      90
        83.5       33      47.3  2024.02
        81.3       82.5    81.9  2024.02
        67.3       70      68.6  2024.02
        50.9       99      67.2
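
The three values in each result row above appear to be precision, recall, and F1, in that order. Only "Precision" is named on the page; reading the other two as recall and F1 is an assumption, supported by the fact that the third value matches the harmonic mean of the first two in every row. A minimal sketch of that check, with a hypothetical helper `f1_score` and the row values copied from the results above:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (first value, second value, third value) for each result row.
# Assumption: the columns are precision, recall, and reported F1.
rows = [
    (87.8, 43.0, 57.7),
    (87.8, 97.0, 92.1),
    (85.6, 95.0, 90.0),
    (83.5, 33.0, 47.3),
    (81.3, 82.5, 81.9),
    (67.3, 70.0, 68.6),
    (50.9, 99.0, 67.2),
]

for p, r, reported in rows:
    computed = f1_score(p, r)
    # Inputs are rounded to one decimal, so allow a small tolerance.
    status = "ok" if abs(computed - reported) < 0.1 else "mismatch"
    print(f"P={p:5.1f}  R={r:5.1f}  computed={computed:6.2f}  reported={reported:5.1f}  {status}")
```

Because the published figures are rounded to one decimal place, the comparison uses a tolerance of 0.1 rather than exact equality.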