Adversarial Prompt Detection on 10,000 Adversarial Prompts

Detection Rate: 83

Our Framework

Mar 17, 2026
Updated 1mo ago

Evaluation Results

Method    Links
2026.03    83    5
2026.03    80    7
2026.03    78    8
2026.03    67    11
2026.03    64    9