
Safety-Guided Flow (SGF): A Unified Framework for Negative Guidance in Safe Generation

About

Safety mechanisms for diffusion and flow models have recently been developed along two distinct paths. In robot planning, control barrier functions are employed to guide generative trajectories away from obstacles at every denoising step by explicitly imposing geometric constraints. In parallel, recent data-driven negative guidance approaches have been shown to suppress harmful content and promote diversity in generated samples. However, these approaches rely on heuristics and do not make clear when safety guidance is actually necessary. In this paper, we first introduce a unified probabilistic framework using a Maximum Mean Discrepancy (MMD) potential for image generation tasks that recasts both Shielded Diffusion and Safe Denoiser as instances of our energy-based negative guidance against unsafe data samples. Furthermore, we leverage control barrier function analysis to justify the existence of a critical time window in which negative guidance must be strong; outside of this window, the guidance should decay to zero to ensure safe and high-quality generation. We evaluate our unified framework on several realistic safe generation scenarios, confirming that negative guidance should be applied in the early stages of the denoising process for successful safe generation.
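As a rough illustration of the two ingredients the abstract names, the sketch below computes a squared MMD between two sample sets and gates a guidance weight by a time window. It is not the paper's implementation: the RBF kernel, the bandwidth, the biased estimator, the `progress` convention (0 at the start of denoising, 1 at the end), the window endpoint, and the hard cutoff are all assumptions chosen for brevity, and the function names are hypothetical.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth=1.0):
    """Pairwise RBF kernel between rows of x (n, d) and rows of y (m, d)."""
    # Squared Euclidean distance between every row of x and every row of y.
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between x and y."""
    return (
        rbf_kernel(x, x, bandwidth).mean()
        + rbf_kernel(y, y, bandwidth).mean()
        - 2.0 * rbf_kernel(x, y, bandwidth).mean()
    )


def guidance_weight(progress, window_end=0.3, strength=1.0):
    """Hypothetical schedule: full strength early in denoising, zero afterwards.

    progress is the fraction of denoising completed (assumed convention).
    A hard cutoff stands in for whatever decay the method actually uses.
    """
    return strength if progress <= window_end else 0.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    generated = rng.normal(loc=0.0, size=(64, 2))  # stand-in for generated samples
    unsafe = rng.normal(loc=3.0, size=(64, 2))     # stand-in for unsafe reference samples
    print("MMD^2 to unsafe set:", mmd_squared(generated, unsafe))
    print("weight early:", guidance_weight(0.1), "weight late:", guidance_weight(0.9))
```

In a negative guidance setting one would presumably push samples in the direction that increases a discrepancy of this kind with respect to the unsafe set, scaled by the time-dependent weight; how the paper defines the potential and its gradient is not stated in the abstract.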

Mingyu Kim, Young-Heon Kim, Mijung Park • 2026

Related benchmarks

Task | Dataset | Result | Rank
Text-to-Image Generation | COCO 30k | FID 23.73 | 53
Safe generation against nudity prompts | MMA-Diffusion | ASR 29.7 | 19
NSFW suppression | Ring-a-Bell | ASR 5.1 | 18
NSFW suppression | Unlearn DiffAtk | ASR 16.4 | 18
Safe generation against nudity prompts | Ring-a-Bell | Attack Success Rate (ASR) 5.1 | 9
Safe generation against nudity prompts | UnlearnDiff | ASR 16.4 | 9
Image quality preservation on benign prompts | COCO | FID 23.73 | 9
Image Generation Memorization Mitigation | ImageNette memorized SD v2.1 (test) | Similarity @ 95% 32.8 | 3