Our new X account is live! Follow @wizwand_team for updates
SOTA Jailbreak Attack benchmarks and papers with code | Wizwand
Jailbreak Attack
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
| --- | --- | --- | --- | --- | --- |
| HarmBench | Cres. | Attack Success Rate (ASR) | 100 | 376 | 4d ago |
| AdvBench | ArtPrompt | AASR | 8,712 | 247 | 4d ago |
| JailbreakBench | DR | ASR@10 | 0 | 132 | 4d ago |
| SafeBench | ArtPrompt | ASR | 0 | 112 | 4d ago |
| StrongREJECT | L3 | Attack Success Rate | 86.3 | 88 | 4d ago |
| Behaviours | PRO-VICUNA-HUBER | ASR | 0.9 | 69 | 4d ago |
| JBB-Behaviors | TrojFill | Rule-Judge Score | 100 | 56 | 3d ago |
| JailbreakBench | Audio Narrative Attacks | ASR | 96.33 | 54 | 4d ago |
| JailbreakBench (JBB) | CoA | ASR | 0 | 54 | 4d ago |
| ShadowRisk | PAIR | ASR-KW | 100 | 48 | 4d ago |
| AdvBench 50 | FlipAttack | ASR (KW) | 100 | 48 | 4d ago |
| Prefilling Attack 40 tokens | Self-Reminder | ASR (%) | 0 | 45 | 4d ago |
| Prefilling Attack 20 tokens | Self-Reminder | ASR | 0.3 | 45 | 4d ago |
| Prefilling Attack 10 tokens | SD_DaExpert | ASR | 70.91 | 45 | 4d ago |
| MaliciousInstruct | PiF | ASR | 100 | 35 | 4d ago |
| HarmfulQA | iMIST | JADES | 56 | 33 | 4d ago |
| MI (MaliciousInstructions) | Puzzler | QSR | 1 | 30 | 4d ago |
| AdvBench AdvSub | Puzzler | QSR | 100 | 30 | 4d ago |
| AutoDAN | No Defense | ASR | 0.86 | 27 | 4d ago |
| PAIR | No Defense | ASR | 76 | 27 | 4d ago |
| GCG | No Defense | ASR | 96 | 27 | 4d ago |
| SafeBench Tiny | COMET | ASR | 100 | 24 | 4d ago |
| Jailbreak prompts Manufacturing and distributing illegal drugs | FICDETAIL | HPR | 100 | 24 | 4d ago |
| AdvBench GPT-3.5-turbo 1.0 (test) | EquaCode | Attack Success Rate | 97.12 | 22 | 4d ago |
| AdvBench (test) | Claude-4-sonnet | ASR (HILL) | 98 | 22 | 4d ago |
Showing 25 of 136 rows