Concept Unlearning Robustness on Ring-A-Bell adversarial prompts (K16)

71.58 ASR (T=0.3)

ESD

Mar 19, 2026
Updated 1mo ago

Evaluation Results

Method  Links
2026.03    71.58    61.05    46.32    18.95
2026.03    56.84    46.32    32.63    14.74
2026.03    51.58    38.95    25.26    10.53
2026.03    51.58    40       16.84     4.21
2026.03    49.47    41.05    15.79     4.21
2026.03    40       29.47    20        5.26
2026.03    35.79    32.63    17.89     5.26
2026.03    27.37    22.11    14.74     3.16
2026.03     3.16     0        0        0
2026.03     0        0        0        0