
Concept Unlearning Robustness on Ring-A-Bell adversarial prompts K38

1.05 ASR (T=0.3)

ACE

[Chart: ASR (T=0.3) over time; date label: Mar 19, 2026]
Updated 1mo ago

Evaluation Results

Method   Links
2026.03   1.05    1.05    0       0
2026.03   3.16    0       0       0
2026.03   32.63   24.21   9.47    4.21
2026.03   44.21   34.74   18.95   6.32
2026.03   45.26   36.84   17.89   3.16
2026.03   51.58   40      18.95   4.21
2026.03   52.63   36.84   23.16   8.42
2026.03   52.63   41.05   20      4.21
2026.03   55.79   48.42   31.58   13.68
2026.03   72.63   66.32   46.32   17.89