Jailbreaking on AdvBench (Classifier-Specific ASR@1)
[Chart: ASR@1 (No Refusal) over time; labeled point: ADV-LLM, Feb 6, 2026. Selectable metrics: ASR@1 (No Refusal), ASR@1 (LLM Classifier), ASR@1 (HarmBench Classifier).]
Updated 4d ago
Evaluation Results
| Method | Victim Model | Date | ASR@1 (No Refusal) | ASR@1 (LLM Classifier) | ASR@1 (HarmBench Classifier) |
|---|---|---|---|---|---|
| ADV-LLM | GPT-oss-2... | 2026.02 | 0 | 0.4 | 0.8 |
| Jigsaw Puzzle | GPT-oss-2... | 2026.02 | 10 | 0.8 | 3.7 |
| Jailbreak-R1 | GPT-oss-2... | 2026.02 | 13.9 | 2.9 | 9.8 |
| FITD | GPT-oss-2... | 2026.02 | 21.5 | 3.5 | 7.5 |
| FlipAttack | GPT-oss-2... | 2026.02 | 31 | 3.7 | 24.8 |
| GOAT | GPT-oss-2... | 2026.02 | 36.2 | 5.4 | 5.6 |
| CoA | GPT-oss-2... | 2026.02 | 42.1 | 1.9 | 6.4 |
| X-Teaming | GPT-oss-2... | 2026.02 | 45.6 | 15 | 30.2 |
| Crescendo | GPT-oss-2... | 2026.02 | 58.5 | 21.2 | 40.2 |
| SEMA | GPT-oss-2... | 2026.02 | 62.7 | 36 | 57.7 |
| ActorAttack | GPT-oss-2... | 2026.02 | 88.3 | 6.5 | 19.2 |
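The results listed above are ordered by ASR@1 (No Refusal), but the ordering changes under the other two classifiers. The following is a minimal sketch of re-ranking the same rows by any of the three metrics. The numbers are copied from the results above; the names `RESULTS`, `COLUMNS`, and `rank_by` are hypothetical helpers introduced only for illustration and are not part of the benchmark or the page.

```python
# Rows copied from the results above:
# (method, ASR@1 No Refusal, ASR@1 LLM Classifier, ASR@1 HarmBench Classifier)
RESULTS = [
    ("ADV-LLM", 0.0, 0.4, 0.8),
    ("Jigsaw Puzzle", 10.0, 0.8, 3.7),
    ("Jailbreak-R1", 13.9, 2.9, 9.8),
    ("FITD", 21.5, 3.5, 7.5),
    ("FlipAttack", 31.0, 3.7, 24.8),
    ("GOAT", 36.2, 5.4, 5.6),
    ("CoA", 42.1, 1.9, 6.4),
    ("X-Teaming", 45.6, 15.0, 30.2),
    ("Crescendo", 58.5, 21.2, 40.2),
    ("SEMA", 62.7, 36.0, 57.7),
    ("ActorAttack", 88.3, 6.5, 19.2),
]

# Hypothetical short names for the three metric columns (tuple positions).
COLUMNS = {"no_refusal": 1, "llm_classifier": 2, "harmbench_classifier": 3}


def rank_by(metric):
    """Return method names sorted from highest to lowest value of `metric`."""
    idx = COLUMNS[metric]
    ordered = sorted(RESULTS, key=lambda row: row[idx], reverse=True)
    return [row[0] for row in ordered]


if __name__ == "__main__":
    for metric in COLUMNS:
        print(metric, rank_by(metric)[:3])
```

Under this sketch, ActorAttack ranks first by the No Refusal metric, while SEMA ranks first under both the LLM Classifier and the HarmBench Classifier, which suggests the choice of classifier matters when comparing methods.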