
Jailbreak Attack Evaluation on HarmBench (400 random samples)

0 ASR

Llama-2-7B-Chat (Original)

Feb 3, 2026
Updated 4d ago

Evaluation Results

Date      ASR (%)   Method   Links
2026.02   0
2026.02   2
2026.02   3.25
2026.02   11.5
2026.02   15.5
2026.02   16
2026.02   20
2026.02   25.5
2026.02   38.5
2026.02   71
2026.02   71
2026.02   77
2026.02   79.5
2026.02   81.25
2026.02   88.75
2026.02   90.25
2026.02   93.5
2026.02   94.25
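
For readers unfamiliar with the metric, the ASR values listed above can be read as percentages of the 400 sampled HarmBench prompts. The sketch below is only an illustration under that assumption: it treats ASR as the share of prompts for which a judge marks the jailbreak as successful. The function name `attack_success_rate` and the judgment list are hypothetical and are not taken from the benchmark's own tooling.

```python
def attack_success_rate(judgments):
    """Return the attack success rate as a percentage.

    `judgments` is a sequence of booleans, one per evaluated prompt,
    where True means the attack was judged successful (assumed definition).
    """
    if not judgments:
        raise ValueError("need at least one judgment")
    # True counts as 1 and False as 0, so sum() gives the number of successes.
    return 100.0 * sum(judgments) / len(judgments)


# Hypothetical judge outputs: 13 successful jailbreaks out of 400 prompts.
judgments = [True] * 13 + [False] * 387
print(attack_success_rate(judgments))
```

Under this reading, an ASR of 3.25 on 400 samples corresponds to 13 successful attacks, and an ASR of 0 means no sampled prompt was jailbroken.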