Can We Build a Monolithic Model for Fake Image Detection? SICA: Semantic-Induced Constrained Adaptation for Unified-Yet-Discriminative Artifact Feature Space Reconstruction
About
Fake Image Detection (FID), which aims at unified detection across four image forensic subdomains, is critical in real-world forensic scenarios. Compared with ensemble approaches, monolithic FID models are theoretically more promising, but to date they have consistently yielded inferior performance in practice. In this work, by discovering the "heterogeneous phenomenon", the intrinsic distinctness of artifacts across subdomains, we diagnose the cause of this underperformance for the first time: the collapse of the artifact feature space driven by this phenomenon. The core challenge in developing a practical monolithic FID model thus boils down to the "unified-yet-discriminative" reconstruction of the artifact feature space. To address this paradoxical challenge, we hypothesize that high-level semantics can serve as a structural prior for the reconstruction, and we further propose Semantic-Induced Constrained Adaptation (SICA), the first monolithic FID paradigm. Extensive experiments on our OpenMMSec dataset demonstrate that SICA outperforms 15 state-of-the-art methods and reconstructs the target unified-yet-discriminative artifact feature space in a near-orthogonal manner, thus firmly validating our hypothesis. The code and dataset are available at: https://github.com/scu-zjz/SICA_OpenMMSec.
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Artifact Detection | OpenMMSec | Deepfake EFS 95.9 | 68 |
| Deepfake Detection | FaceForensics++ c23 (test) | AUC 95.5 | 26 |
| Image Forgery Detection | ForensicHub IFF-Protocol v2025 (test) | FF-c40 0.825 | 23 |
| Deepfake Detection | DeepfakeBench Cross Domain multiple (test) | AUC (CDFv1) 90.67 | 18 |
| AIGC Detection | GenImage v1 (test) | Midjourney Accuracy 65.5 | 8 |