Forgetting is Competition: Rethinking Unlearning as Representation Interference in Diffusion Models
About
Deployed text-to-image diffusion models increasingly require post-hoc concept unlearning for copyright claims, artist opt-outs, safety updates, and protected-content mitigation without full retraining. A central challenge is erase-retain imbalance: aggressive updates suppress targets but damage shared capabilities, while conservative or anchor-based updates preserve quality yet leave concepts recoverable through related, compositional, paraphrased, or adversarial prompts. Inspired by retroactive interference, we propose SurgUn, which treats forgetting as controlled competition rather than direct deletion or one-to-one reassignment. SurgUn instantiates retroactive concept interference via distractor-conditioned gradient competition: gradient ascent on the target weakens target-conditioned denoising or flow-matching behavior, while gradient descent over a semantically diverse distractor set introduces competing non-target trajectories under the same prompt context. This redistributes outputs across multiple non-target modes instead of collapsing them onto a single proxy. To limit collateral forgetting through shared pathways, SurgUn adds pixel-grounded weight-space localization, a lightweight diagnostic that selects attention blocks according to their erase-retain behavior on generated images, exploiting the asymmetry that suppression is broadly achievable whereas retention is block-selective. Across UnlearnCanvas, IP-character erasure, Holistic Unlearning, EraseBench, and Ring-A-Bell on Stable Diffusion v1.5, SDXL, and SANA-1.5, SurgUn achieves a stronger erase-retain balance than baselines. Ablations show that diverse distractors, contrastive competition, and localization are all necessary for robust suppression while preserving related and unrelated concepts.
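The gradient competition described above can be illustrated with a deliberately tiny toy. This is a sketch under stated assumptions, not SurgUn's implementation: a linear map `W` stands in for the denoiser, a fixed vector `c` stands in for the prompt embedding, and the names `competition_step`, `ascent_weight`, and `lr` are hypothetical. Each update ascends the loss on the target reference and descends the mean loss over several distractor references, all under the same `c`.

```python
# Toy sketch only: a linear "denoiser" pred = W @ c, with hypothetical names.
# It is NOT the paper's training code; it only illustrates ascent-on-target
# plus descent-on-distractors under a shared prompt context.

def predict(W, c):
    """Prediction of the toy model for prompt embedding c."""
    return [sum(w * x for w, x in zip(row, c)) for row in W]

def loss(W, c, ref):
    """Squared-error loss 0.5 * ||W c - ref||^2."""
    return 0.5 * sum((p - r) ** 2 for p, r in zip(predict(W, c), ref))

def grad(W, c, ref):
    """Gradient of the loss with respect to W: outer product (W c - ref) c^T."""
    return [[(p - r) * x for x in c] for p, r in zip(predict(W, c), ref)]

def competition_step(W, c, target, distractors, lr=0.1, ascent_weight=0.25):
    """One update: descend the mean distractor loss, ascend the target loss."""
    g_target = grad(W, c, target)
    g_dists = [grad(W, c, d) for d in distractors]
    k = len(distractors)
    new_W = []
    for i, row in enumerate(W):
        new_row = []
        for j, w in enumerate(row):
            g_dist_mean = sum(g[i][j] for g in g_dists) / k
            # Minus sign on the target term turns descent into ascent.
            new_row.append(w - lr * (g_dist_mean - ascent_weight * g_target[i][j]))
        new_W.append(new_row)
    return new_W

# Assumed toy data: the model initially reproduces the target exactly.
c = [1.0, 0.0]                                   # shared prompt context
target = [1.0, 0.0]                              # behaviour to forget
distractors = [[0.0, 1.0], [0.0, -1.0], [-1.0, 0.0]]  # diverse non-target references
W0 = [[1.0, 0.0], [0.0, 1.0]]

def mean_distractor_loss(W):
    return sum(loss(W, c, d) for d in distractors) / len(distractors)

W = W0
for _ in range(200):
    W = competition_step(W, c, target, distractors)

print("target loss:     ", loss(W0, c, target), "->", loss(W, c, target))
print("distractor loss: ", mean_distractor_loss(W0), "->", mean_distractor_loss(W))
```

In this toy the prediction drifts away from the target reference and settles near (slightly beyond) the distractor mean, which is exactly the single-point collapse a linear model cannot avoid. In a real diffusion or flow-matching model, the losses would be noise-prediction or velocity-matching terms over sampled timesteps. With `ascent_weight` below 1 the toy update has a stable fixed point; at 1 or above, the ascent term dominates and the toy diverges.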
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Style Unlearning | UnlearnCanvas | UA | 0.9779 | 36 |
| Safety Unlearning Evaluation | Ring-A-Bell Nudity (test) | ASR | 9.22 | 21 |
| Safety Unlearning Evaluation | Ring-A-Bell Violence (test) | ASR | 5.54 | 21 |
| Generation Prevention | IP character | CLIPe | 0.17 | 16 |
| Machine Unlearning | Sequential Unlearning Concepts (T1-T6) Stable Diffusion XL, SD v1.5, SANA | UA (T1) | 100 | 15 |
| Object Unlearning | Object Unlearning | Unlearning Accuracy (UA) | 95.36 | 13 |
| Concept Preservation | Related Concept Categories Church | Preservation Score | 96 | 9 |
| Concept Preservation | Related Concept Categories Parachute | Preservation Score | 95.6 | 9 |
| Concept Preservation | Related Concept Categories Gas pump | Preservation Score | 92.4 | 9 |
| Concept Preservation | Related Concept Categories Average | Preservation Score | 90.5 | 9 |