
JailbreakV

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Inference Latency | JailbreakV | Latency (s): 2.81 | 25 |
| Text-Based Jailbreak Attack | JailbreakV-28K (test) | ASR (None-Template): 75.23 | 25 |
| Safety Evaluation | JailbreakV-28K v1 (test) | ASR (Noise-T): 6.63 | 18 |
| Safety Evaluation | JailBreakV | ASR: 11.87 | 15 |
| Jailbreak Detection | JailbreakV | AUROC: 99.69 | 9 |
| Jailbreak Defense | JailbreakV-28K | ASR (Noise, T): 8.4 | 6 |
| Jailbreak Attack Defense | JailbreakV-28K v1 (test) | Defense Success Rate (Noise - T): 38.16 | 6 |
| Safety Evaluation | JailBreakV (test) | HPR: 23 | 3 |