Jailbreak Detection on WJB
Top result: gpt-5-mini, 93.26 ACC (Feb 14, 2026). Updated 4d ago.
Evaluation Results
Method                     Configuration           Date     ACC
gpt-5-mini                 Version=2025-08-07      2026.02  93.26
AISA                       Backbone=Llama3.1-8b-I  2026.02  87.29
gpt-4o-mini                Version=2024-07-18      2026.02  86.65
AISA                       Backbone=Llama2-13b-I   2026.02  85.43
AISA                       Backbone=Qwen3-8b-I     2026.02  83.03
AISA                       Backbone=GPT-OSS-20b-I  2026.02  81.49
Jailbreak-Classifier       -                       2026.02  81.45
AISA                       Backbone=Mistral-7b-I   2026.02  79.23
gpt-4.1-mini               Version=2025-04-14      2026.02  79.19
GradSafe                   -                       2026.02  69.46
NemoGuard-JailbreakDetect  -                       2026.02  68.51
SPDetector                 -                       2026.02  52.17
Llama-Prompt-Guard-2       Parameters=86M          2026.02  46.33