
Jailbreak Evaluation on AdvBench Ensemble configuration Claude-v2

1.08 Harmfulness Score (HS)

ArtPrompt

[Chart: score over time, Dec 19, 2024]

Evaluation Results

Method    Links
2024.12    1.080
2024.12    1.446
2024.12    1.68
2024.12    1.668
2024.12    2.2220
2024.12    3.3638
2024.12    3.3856
2024.12    3.4452
2024.12    3.6440
2024.12    3.866
2024.12    3.8668
2024.12    4.0664
2024.12    4.2872
2024.12    4.4884
2024.12    4.5486