Adversarial Prompt Detection on 10,000 Adversarial Prompts

83 Detection Rate

Our Framework

Updated 4mo ago

Evaluation Results

Method	Links	Detection Rate
Our Framework 2026.03		83	5
SecurityLingua 2026.03		80	7
PromptShield 2026.03		78	8
Commercial Moderation APIs 2026.03		67	11
Open-Source Detoxify 2026.03		64	9