
DAN

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Backdoor Attack | DAN (Do-Anything-Now) | ASRw: 88.07 | 48 |
| Safety Evaluation | DAN | Safety Score (DAN): 91 | 18 |
| Harmful Refusal | DAN | ASR: 54.2 | 7 |
| Jailbreak Defense | DAN | Drop in ASR: 42.9 | 6 |
| Jailbreak Robustness | DAN Static | ASR: 74.8 | 3 |