Defending against Gradient-based Attacks on Llama3 AutoDAN Attack (test)
[Chart: ASR by date. Top entry: Ours-Fakecom-t, ASR 10.57, Nov 1, 2024. Updated 4d ago.]
Evaluation Results
Method           Details                      Date      ASR
Ours-Fakecom-t   Defense=Fake Completio...    2024.11   10.57
Ours-Fakecom     Defense=Fake Completio...    2024.11   14.9
Ours-Ignore      Defense=Ignore Defense       2024.11   16.34
Spotlight        Defense=Spotlight defe...    2024.11   24.51
Ours-Escape      Defense=Escape Defense       2024.11   38.94
Sandwich         Defense=Sandwich defen...    2024.11   39.42
Reminder         Defense=Reminder defen...    2024.11   51.44
Instructional    Defense=Instructional...     2024.11   52.88
Isolation        Defense=Isolation defe...    2024.11   54.32
None             Defense=No defense bas...    2024.11   68.75