Unsafe Prompt Detection on XSTest (test)

87.8 Precision

OpenAI Moderation API

[Chart: axis ticks 49.424, 59.387, 69.35, 79.313; date Feb 21, 2024]
Updated 1mo ago

Evaluation Results

Precision  Recall  F1    Date
87.8       43      57.7  2024.02
87.8       97      92.1  2024.02
85.6       95      90
83.5       33      47.3  2024.02
81.3       82.5    81.9  2024.02
67.3       70      68.6  2024.02
50.9       99      67.2