Bridging the Micro-Macro Gap: Frequency-Aware Semantic Alignment for Image Manipulation Localization
About
As generative image editing advances, image manipulation localization (IML) must handle both traditional manipulations with conspicuous forensic artifacts and diffusion-generated edits that appear locally realistic. Existing methods typically rely on either low-level forensic cues or high-level semantics alone, leading to a fundamental micro-macro gap. To bridge this gap, we propose FASA, a unified framework for localizing both traditional and diffusion-generated manipulations. Specifically, we extract manipulation-sensitive frequency cues through an adaptive dual-band DCT module and learn manipulation-aware semantic priors via patch-level contrastive alignment on frozen CLIP representations. We then inject these priors into a hierarchical frequency pathway through a semantic-frequency side adapter for multi-scale feature interaction, and employ a prototype-guided, frequency-gated mask decoder to integrate semantic consistency with boundary-aware localization for tampered region prediction. Extensive experiments on OpenSDI and multiple traditional manipulation benchmarks demonstrate state-of-the-art localization performance, strong cross-generator and cross-dataset generalization, and robust performance under common image degradations.
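To make the idea of dual-band frequency cues concrete, the following is a minimal sketch of splitting a block's DCT coefficients into a low and a high band. It is not the FASA module: the abstract does not specify how the bands are chosen or adapted, so the fixed `cutoff` on the index sum `u + v`, the 8x8 block size, and the function names `dct2` and `band_energies` are all assumptions made for illustration.

```python
import math


def dct2(block):
    """Orthonormal 2D DCT-II of a square block (list of lists of floats)."""
    n = len(block)

    def alpha(k):
        # Normalization that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # basis[k][i] = alpha(k) * cos(pi * (2i + 1) * k / (2n))
    basis = [
        [alpha(k) * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
        for k in range(n)
    ]
    # Separable transform: apply the 1D basis along rows, then along columns.
    tmp = [
        [sum(basis[u][i] * block[i][j] for i in range(n)) for j in range(n)]
        for u in range(n)
    ]
    return [
        [sum(tmp[u][j] * basis[v][j] for j in range(n)) for v in range(n)]
        for u in range(n)
    ]


def band_energies(coeffs, cutoff):
    """Sum squared coefficients below / at-or-above a fixed index cutoff.

    Assumption: a coefficient (u, v) is 'low band' when u + v < cutoff.
    An adaptive module would learn or predict this split instead.
    """
    low = high = 0.0
    for u, row in enumerate(coeffs):
        for v, c in enumerate(row):
            if u + v < cutoff:
                low += c * c
            else:
                high += c * c
    return low, high


# A flat block has only a DC component; a checkerboard is dominated by
# high-frequency content.
flat = [[1.0] * 8 for _ in range(8)]
checker = [[1.0 if (i + j) % 2 == 0 else -1.0 for j in range(8)] for i in range(8)]

flat_low, flat_high = band_energies(dct2(flat), cutoff=4)
chk_low, chk_high = band_energies(dct2(checker), cutoff=4)
```

Because the transform is orthonormal, the two band energies always sum to the block's pixel energy, so the split only redistributes energy between bands. Smooth content concentrates in the low band, while sharp local transitions push energy into the high band.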
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Manipulation Localization | NIST16 | F1 Score | 40.92 | 75 |
| Image Manipulation Localization | Coverage | F1 Score | 65.37 | 49 |
| Pixel-level Forgery Localization | Columbia | F1 | 90.57 | 20 |
| Image-level detection | OpenSDI | SD1.5 F1 Score | 93.75 | 15 |
| Image Manipulation Localization | OpenSDI SD1.5 | F1 Score | 81.19 | 9 |
| Image Manipulation Localization | OpenSDI SD2.1 | F1 Score | 72.71 | 9 |
| Image Manipulation Localization | OpenSDI SDXL | F1 Score | 49.14 | 9 |
| Image Manipulation Localization | OpenSDI SD3 | F1 Score | 61.74 | 9 |
| Image Manipulation Localization | OpenSDI Flux.1 | F1 Score | 24.37 | 9 |
| Image Manipulation Localization | OpenSDI Average | F1 Score | 57.83 | 9 |
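
As a reading aid for the table, the sketch below shows the standard F1 formula applied to flattened binary masks and checks that the OpenSDI Average row equals the unweighted mean of the five per-generator localization scores. The helper name `f1_score`, the toy masks, and the choice to return 0.0 when both masks are empty are assumptions for illustration; the listing does not state the exact evaluation protocol (thresholding, per-image versus dataset-level averaging).

```python
def f1_score(pred, target):
    """F1 = 2*TP / (2*TP + FP + FN) over flattened binary masks."""
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    denom = 2 * tp + fp + fn
    # Assumption: score 0.0 when neither mask marks any pixel as tampered.
    return 2 * tp / denom if denom else 0.0


# Toy example: one true positive, one false positive, one false negative.
toy_f1 = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
print(toy_f1)  # 0.5

# Per-generator localization F1 values copied from the table.
opensdi_f1 = {
    "SD1.5": 81.19,
    "SD2.1": 72.71,
    "SDXL": 49.14,
    "SD3": 61.74,
    "Flux.1": 24.37,
}
average = round(sum(opensdi_f1.values()) / len(opensdi_f1), 2)
print(average)  # 57.83
```

The mean matches the reported 57.83, which suggests the average row weights each generator equally.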