Forgetting is Competition: Rethinking Unlearning as Representation Interference in Diffusion Models

About

Deployed text-to-image diffusion models increasingly require post-hoc concept unlearning for copyright claims, artist opt-outs, safety updates, and protected-content mitigation without full retraining. A central challenge is erase-retain imbalance: aggressive updates suppress targets but damage shared capabilities, while conservative or anchor-based updates preserve quality yet leave concepts recoverable through related, compositional, paraphrased, or adversarial prompts. Inspired by retroactive interference, we propose SurgUn, which treats forgetting as controlled competition rather than direct deletion or one-to-one reassignment. SurgUn instantiates retroactive concept interference via distractor-conditioned gradient competition: target-gradient ascent weakens target-conditioned denoising or flow-matching behavior, while descent over a semantically diverse distractor set introduces competing non-target trajectories under the same prompt context. This redistributes outputs across multiple non-target modes instead of collapsing to a single proxy. To limit collateral forgetting through shared pathways, SurgUn adds pixel-grounded weight-space localization, a lightweight diagnostic that selects attention blocks by generated-image erase-retain behavior, exploiting the asymmetry that suppression is broadly achievable whereas retention is block-selective. Across UnlearnCanvas, IP-character erasure, Holistic Unlearning, EraseBench, and Ring-A-Bell on Stable Diffusion v1.5, SDXL, and SANA-1.5, SurgUn achieves a stronger erase-retain balance than baselines. Ablations show that diverse distractors, contrastive competition, and localization are all necessary for robust suppression while preserving related and unrelated concepts.
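
The gradient competition described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `toy_denoiser`, `ascent_weight`, the noise-prediction MSE loss, the finite-difference gradient, and all data values are assumptions chosen only to show one update that ascends a target loss while descending a mean loss over several distractors, all conditioned on the same prompt embedding.

```python
def mse(pred, ref):
    """Mean squared error between two equal-length vectors."""
    return sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred)


def toy_denoiser(theta, x_t, cond):
    """Hypothetical stand-in for a conditional denoiser: per-dimension affine map."""
    d = len(x_t)
    weights, biases = theta[:d], theta[d:]
    return [w * x + b * cond for w, b, x in zip(weights, biases, x_t)]


def competition_losses(theta, target, distractors, cond):
    """Return (target_loss, mean_distractor_loss) under the same prompt context."""
    x_t, eps = target
    target_loss = mse(toy_denoiser(theta, x_t, cond), eps)
    distractor_loss = sum(
        mse(toy_denoiser(theta, x_d, cond), eps_d) for x_d, eps_d in distractors
    ) / len(distractors)
    return target_loss, distractor_loss


def objective(theta, target, distractors, cond, ascent_weight=1.0):
    """Minimizing this ascends the target loss and descends the distractor loss."""
    target_loss, distractor_loss = competition_losses(theta, target, distractors, cond)
    return distractor_loss - ascent_weight * target_loss


def numerical_grad(fn, theta, h=1e-5):
    """Central-difference gradient; a real setup would use autograd instead."""
    grad = []
    for i in range(len(theta)):
        up = theta[:i] + [theta[i] + h] + theta[i + 1:]
        down = theta[:i] + [theta[i] - h] + theta[i + 1:]
        grad.append((fn(up) - fn(down)) / (2 * h))
    return grad


def competition_step(theta, target, distractors, cond, lr=0.1, ascent_weight=1.0):
    """One plain gradient-descent step on the combined objective."""
    fn = lambda t: objective(t, target, distractors, cond, ascent_weight)
    grad = numerical_grad(fn, theta)
    return [t - lr * g for t, g in zip(theta, grad)]


# Toy data (assumed): (noisy input, noise to predict) pairs under one prompt embedding.
cond = 1.0
target = ([1.0, 0.0], [1.0, 0.0])
distractors = [([0.0, 1.0], [0.0, 1.0]), ([0.0, 2.0], [0.0, 2.0])]
theta = [0.5, 0.5, 0.0, 0.0]  # two weights followed by two biases

before = competition_losses(theta, target, distractors, cond)
after = competition_losses(
    competition_step(theta, target, distractors, cond), target, distractors, cond
)
```

On this toy data, one step raises the target loss from 0.125 to about 0.18 and lowers the mean distractor loss from 0.3125 to about 0.14. Averaging over several distractors, rather than descending toward a single proxy, is the part of the sketch that mirrors the abstract's point about spreading outputs across multiple non-target modes. The sketch omits the localization step and any safeguard against unbounded ascent.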

Ashutosh Ranjan, Vivek Srivastava, Shirish Karande, Murari Mandal • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Style Unlearning | UnlearnCanvas | UA | 0.9779 | 36 |
| Safety Unlearning Evaluation | Ring-A-Bell Nudity (test) | ASR | 9.22 | 21 |
| Safety Unlearning Evaluation | Ring-A-Bell Violence (test) | ASR | 5.54 | 21 |
| Generation Prevention | IP character | CLIPe | 0.17 | 16 |
| Machine Unlearning | Sequential Unlearning Concepts (T1-T6) Stable Diffusion XL, SD v1.5, SANA | UA (T1) | 100 | 15 |
| Object Unlearning | Object Unlearning | Unlearning Accuracy (UA) | 95.36 | 13 |
| Concept Preservation | Related Concept Categories Church | Preservation Score | 96 | 9 |
| Concept Preservation | Related Concept Categories Parachute | Preservation Score | 95.6 | 9 |
| Concept Preservation | Related Concept Categories Gas pump | Preservation Score | 92.4 | 9 |
| Concept Preservation | Related Concept Categories Average | Preservation Score | 90.5 | 9 |
Showing 10 of 15 rows
