EraseAnything++: Enabling Concept Erasure in Rectified Flow Transformers Leveraging Multi-Object Optimization
About
Removing undesired concepts from large-scale text-to-image (T2I) and text-to-video (T2V) diffusion models while preserving overall generative quality remains a major challenge, particularly as modern models such as Stable Diffusion v3, Flux, and OpenSora employ flow-matching and transformer-based architectures and extend to long-horizon video generation. Existing concept erasure methods, designed for earlier T2I/T2V models, often fail to generalize to these paradigms. To address this issue, we propose EraseAnything++, a unified framework for concept erasure in both image and video diffusion models with flow-matching objectives. Central to our approach is formulating concept erasure as a constrained multi-objective optimization problem that explicitly balances concept removal with preservation of generative utility. To solve the resulting conflicting objectives, we introduce an efficient utility-preserving unlearning strategy based on implicit gradient surgery. Furthermore, by integrating LoRA-based parameter tuning with attention-level regularization, our method anchors erasure on key visual representations and propagates it consistently across spatial and temporal dimensions. In the video setting, we further enhance consistency through an anchor-and-propagate mechanism that initializes erasure on reference frames and enforces it throughout subsequent transformer layers, thereby mitigating temporal drift. Extensive experiments on both image and video benchmarks demonstrate that EraseAnything++ substantially outperforms prior methods in erasure effectiveness, generative fidelity, and temporal consistency, establishing a new state of the art for concept erasure in next-generation diffusion models.
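As a rough illustration of the conflicting-objectives idea in the abstract, the sketch below shows an explicit, PCGrad-style gradient projection on toy vectors. This is not the paper's implicit gradient surgery, whose details the abstract does not give. It is a simpler, well-known stand-in, and the function name `project_conflicting` and the toy gradients `g_erase` and `g_preserve` are hypothetical names introduced only for this example.

```python
import numpy as np


def project_conflicting(g_erase, g_preserve):
    """Remove from g_erase the component that opposes g_preserve.

    Assumption: explicit PCGrad-style projection, used here only as an
    illustrative stand-in for the implicit gradient surgery named in the
    abstract.
    """
    dot = float(np.dot(g_erase, g_preserve))
    if dot >= 0.0:
        # The two objectives do not conflict; keep the erasure gradient.
        return g_erase.copy()
    norm_sq = float(np.dot(g_preserve, g_preserve))
    if norm_sq == 0.0:
        # No preservation signal to project against.
        return g_erase.copy()
    # Subtract the projection of g_erase onto g_preserve.
    return g_erase - (dot / norm_sq) * g_preserve


# Toy gradients (hypothetical values): the erasure step would hurt utility
# along the second coordinate.
g_erase = np.array([1.0, -2.0])
g_preserve = np.array([0.0, 1.0])
g_update = project_conflicting(g_erase, g_preserve)
```

After projection, the update no longer has a negative inner product with the preservation gradient, so to first order a step along it should not increase the utility loss.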
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Text-to-Image Generation | MS-COCO | FID | 21.5 | 131 |
| Nudity Erasure | I2P | Total Count | 182 | 38 |
| Image Generation | MS-COCO 10k (test) | FID | 21.67 | 24 |
| Concept Erasure Robustness | Unlearn DiffAtk | Nudity Rate | 68.8 | 9 |
| Concept Erasure Robustness | Ring-a-Bell | Nudity Rate | 22.93 | 9 |
| Concept Erasure Robustness | Ring-A-Bell Union | Nudity Rate | 30.27 | 9 |
| Artistic Style Erasure | 200-artist dataset | ACCe | 20.71 | 9 |
| Concept Erasure | Concept Erasure Abstraction | ACCe | 20.6 | 9 |
| Concept Erasure | Concept Erasure Relationship | ACCe | 18.5 | 9 |
| Concept Erasure | Concept Erasure Entity | ACCe | 12.4 | 9 |