Jailbreak Detection on FigStep (AUROC/AUPRC)
[Chart: AUROC / AUPRC. Top result: Mahal-OOD, AUROC 0.9955, Dec 12, 2025. Updated 4d ago.]
Evaluation Results
| Method | Configuration | Date | AUROC | AUPRC |
|---|---|---|---|---|
| Mahal-OOD | Model=LLaVA, Uses targ... | 2025.12 | 0.9955 | 0.979 |
| Mahal-OOD* | Model=FLAVA, Uses targ... | 2025.12 | 0.982 | 0.9253 |
| KNN-OOD | Model=LLaVA, Uses targ... | 2025.12 | 0.9809 | 0.916 |
| JailDAM* | Model=CLIP, Uses targe... | 2025.12 | 0.9608 | 0.9616 |
| KNN-OOD* | Model=FLAVA, Uses targ... | 2025.12 | 0.9081 | 0.668 |
| LLaVaGuard | Model=Qwen | 2025.12 | 0.836 | 0.7231 |
| GradSafe | Model=LLaVA | 2025.12 | 0.6804 | 0.237 |
| VLGuard | Model=LLaVA | 2025.12 | 0.6106 | 0.3817 |
| HiddenDetect | Model=LLaVA | 2025.12 | 0.5773 | 0.3238 |