Jailbreak Detection on L3J
Top result: GradSafe, 98.3 Accuracy
[Chart: Accuracy over time; date shown: Feb 14, 2026]
Evaluation Results

| Method | Links | Accuracy |
| --- | --- | --- |
| GradSafe | 2026.02 | 98.3 |
| gpt-4o-mini (Version=2024-07-18) | 2026.02 | 98.14 |
| gpt-5-mini (Version=2025-08-07) | 2026.02 | 96.98 |
| gpt-4.1-mini (Version=2025-04-14) | 2026.02 | 96.81 |
| AISA (Backbone=Llama2-13b-I) | 2026.02 | 96.76 |
| AISA (Backbone=Llama3.1-8b-I) | 2026.02 | 96.62 |
| AISA (Backbone=Qwen3-8b-I) | 2026.02 | 94.95 |
| AISA (Backbone=GPT-OSS-20b-I) | 2026.02 | 94.04 |
| AISA (Backbone=Mistral-7b-I) | 2026.02 | 91.7 |
| Jailbreak-Classifier | 2026.02 | 84.57 |
| NemoGuard-JailbreakDetect | 2026.02 | 78.48 |
| SPDetector | 2026.02 | 65.79 |
| Llama-Prompt-Guard-2 (Parameters=86M) | 2026.02 | 54.88 |