Jailbreak Detection on JailBreakBench Single Turn 35
[Chart: best F1 Score 98 (DeepContext), Feb 18, 2026. Metrics shown: F1 Score, Recall, Precision.]
Evaluation Results
| Method | Configuration | Date | F1 Score | Recall | Precision |
|---|---|---|---|---|---|
| DeepContext | - | 2026.02 | 98 | 100 | 95 |
| Qwen3Guard-Gen | Parameters=8B | 2026.02 | 88 | 95 | 83 |
| Llama-Guard-4 | Parameters=12B | 2026.02 | 86 | 86 | 86 |
| Gpt5 | Size=Nano | 2026.02 | 83 | 91 | 76 |
| GCP Model Armor | - | 2026.02 | 83 | 96 | 74 |
| Granite-Guardian-3.3 | Parameters=8B | 2026.02 | 78 | 100 | 65 |
| Llama-Prompt-Guard-2 | Parameters=86M | 2026.02 | 59 | 50 | 72 |
| Deberta-v3-Prompt-Injection | - | 2026.02 | 54 | 57 | 50 |
| AWS Prompt Attack Guardrails | - | 2026.02 | 8 | 4 | 100 |
| Azure Prompt Shield | - | 2026.02 | 0 | 0 | 0 |
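The page does not state how F1 Score is derived. Assuming it is the standard harmonic mean of precision and recall, the relationship between the three reported columns can be sketched as below. The function name `f1_score` and the choice of sample rows are illustrative, and because the published figures are rounded to whole numbers, recomputed values may differ from the listed F1 by about a point.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition, not confirmed by the page).

    Returns 0.0 when both inputs are zero to avoid dividing by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the results above, on the 0-100 scale.
rows = {
    "Llama-Guard-4": (86, 86),
    "AWS Prompt Attack Guardrails": (100, 4),
    "Azure Prompt Shield": (0, 0),
}

for name, (p, r) in rows.items():
    print(f"{name}: F1 = {f1_score(p, r):.1f}")
# Llama-Guard-4: F1 = 86.0
# AWS Prompt Attack Guardrails: F1 = 7.7
# Azure Prompt Shield: F1 = 0.0
```

The AWS row shows why a precision of 100 can coexist with an F1 near 8: the harmonic mean is pulled toward the smaller of the two inputs, so a recall of 4 dominates the result.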