
Attack Success Rate Evaluation on HRL/LRL Safety Prompts English, Single Image v1

0 ASR

Claude 3.5 Sonnet

[Chart: ASR over time; data point dated Apr 13, 2025]
Updated 4d ago

Evaluation Results

Method    Links
0
0.03
0.1
2025.04
0.13
0.23
0.72