AdvBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Jailbreak Attack | AdvBench | AASR 8,712 | 247 |
| Safety Evaluation | Advbench | Safety Score 100 | 117 |
| Adversarial Attack Success Rate | AdvBench | ASR 0 | 75 |
| Jailbreak | AdvBench | Avg Queries 2.1 | 63 |
| Jailbreak Defense | AdvBench | ASR (Overall) 0 | 49 |
| Jailbreak Attack | AdvBench 50 | ASR (KW) 100 | 48 |
| Harmful Request Defense | AdvBench | ASR 0 | 44 |
| Jailbreaking | AdvBench | ASR 99.2 | 44 |
| Transferable Adversarial Attack | AdvBench LLM Classifier (test) | TASR@19,260 | 39 |
| Jailbreak Defense | AdvBench PAIR attack | DSR 98 | 35 |
| Safety evaluation | AdvBench 50 examples | Safe Response Rate 100 | 32 |
| Jailbreak Attack | AdvBench AdvSub | QSR 100 | 30 |
| Jailbreak | AdvBench Ensemble configuration GPT-4o | Attack Success Rate (ASR) 0 | 25 |
| Jailbreak Attack | AdvBench GPT-3.5-turbo 1.0 (test) | Attack Success Rate 97.12 | 22 |
| Jailbreak Attack | AdvBench (test) | ASR (HILL) 98 | 22 |
| Jailbreak Attack | Advbench Vicuna-33B Guard 100 prompts Original | ASR 0 | 21 |
| Jailbreak Attack | Advbench Llama2-70B Guard 100 prompts Original | ASR 0 | 21 |
| Adversarial and Jailbreaking Attack Detection | AdvBench | AUROC 0.9675 | 20 |
| Adversarial Attack | AdvBench (query-specific) | MR 37 | 20 |
| Priming Attack Robustness | AdvBench No Attack (test) | ASR (GPT-4o) 0 | 18 |
| Jailbreak Robustness | AdvBench | PAIR ASR (GPT-4o) 4 | 18 |
| Jailbreak Attack Success Rate | AdvBench-x | ASR (English) 7.94 | 18 |
| Jailbreak Attack | AdvBench Llama3.1-405B 1.0 (test) | ASR 0 | 17 |
| Jailbreak Attack | AdvBench GPT-4o-mini 1.0 (test) | ASR 81.35 | 17 |
| Jailbreak Attack | AdvBench GPT-4o 1.0 (test) | ASR 92.67 | 17 |
Showing 25 of 73 rows