Jailbreak Detection on MM-SafetyBench
[Chart: AUROC / AUPRC over time. Best result: Mahal-OOD, 99.18 AUROC (Dec 12, 2025)]
Updated 4d ago
Evaluation Results

Method        Settings                   Date     AUROC  AUPRC
Mahal-OOD     Model=LLaVA, Uses targ...  2025.12  99.18  99.32
Mahal-OOD*    Model=FLAVA, Uses targ...  2025.12  97.01  97.62
KNN-OOD       Model=LLaVA, Uses targ...  2025.12  95.23  95.91
JailDAM*      Model=CLIP, Uses targe...  2025.12  91.26  98.04
KNN-OOD*      Model=FLAVA, Uses targ...  2025.12  85.47  90.65
GradSafe      Model=LLaVA                2025.12  85.14  87.52
HiddenDetect  Model=LLaVA                2025.12  82.69  93.53
LLaVaGuard    Model=Qwen                 2025.12  74.27  87.29
VLGuard       Model=LLaVA                2025.12  61.06  80.2