Defending against Gradient-based Attacks on Llama3 AutoDAN Attack (test)

10.57 ASR

Ours-Fakecom-t

Updated 4d ago

Evaluation Results

Date       ASR
2024.11    10.57
2024.11    14.9
2024.11    16.34
2024.11    24.51
2024.11    38.94
2024.11    39.42
2024.11    51.44
2024.11    52.88
2024.11    54.32
2024.11    68.75