Ring-a-bell

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Safety Unlearning Evaluation | Ring-A-Bell Violence (test) | ASR: 5.54 | 21 |
| Safety Unlearning Evaluation | Ring-A-Bell Nudity (test) | ASR: 9.22 | 21 |
| Concept Unlearning | Ring-A-Bell | Ring-A-Bell Score: 0.83 | 20 |
| NSFW suppression | Ring-A-Bell | ASR: 1.3 | 18 |
| Adversarial Robustness in Concept Erasing | Ring-A-Bell K-16, K-38, K-77 | K-16 Score: 0.9579 | 14 |
| Safety Evaluation | Ring-A-Bell | Ring-16 Score: 4.93 | 13 |
| Safe Text-to-Image Generation | Ring-A-Bell | ASR: 83.1 | 13 |
| Nudity Concept Erasure | Ring-a-bell Adversarial Prompts | Erase Rate (%): 100 | 13 |
| Concept Unlearning Robustness | Ring-A-Bell adversarial prompts (K77) | ASR (Threshold 0.3): 1.05 | 10 |
| Concept Unlearning Robustness | Ring-A-Bell adversarial prompts K38 | ASR (T=0.3): 1.05 | 10 |
| Concept Unlearning Robustness | Ring-A-Bell adversarial prompts (K16) | ASR (T=0.3): 71.58 | 10 |
| Safe generation against nudity prompts | Ring-A-Bell | Attack Success Rate (ASR): 5.1 | 9 |
| Concept Erasure Robustness | Ring-A-Bell Union | Nudity Rate: 24.77 | 9 |
| Concept Erasure Robustness | Ring-A-Bell | Nudity Rate: 11.01 | 9 |
| Implicit Concept Erasure | Ring-A-Bell | ASR: 39 | 9 |
| Adversarial Robustness | Ring-A-Bell | Unsafe Ratio: 27 | 8 |
| Concept Erasure | Ring-A-Bell | ASR: 83.1 | 8 |
| Video Nudity Erasure | Ring-A-Bell | Nudity Rate: 6.97 | 6 |
| Concept Erasure | Ring-A-Bell (285 prompts) | Attack Success Rate (w/o ATTACK): 83.15 | 5 |
| Nudity revival | Ring-A-Bell 101 prompts | Detected NSFW Image Count: 101 | 5 |
| Concept Erasure | Ring-A-Bell (Ring77) | ASR: 95.8 | 5 |
| Concept Erasure | Ring-A-Bell Ring38 | ASR: 94.7 | 5 |
| Concept Erasure | Ring-A-Bell Ring16 | ASR: 97.9 | 5 |
| Safe Image Generation | Ring-A-Bell violence | DSR: 62.44 | 4 |
| NSFW Removal | Ring-a-Bell | Ring-a-Bell Score: 97.72 | 4 |

Showing 25 of 27 rows