
Malicious Prompt Detection on In-the-wild Jailbreak Prompts

Accuracy: 98.15

Enhanced Filtering and Summarization System

May 2, 2025
Updated 1mo ago

Evaluation Results

Method    Links
2025.05    98.15
2025.05    93.59
2025.05    6.62
2025.05    1.92