Can We Build a Monolithic Model for Fake Image Detection? SICA: Semantic-Induced Constrained Adaptation for Unified-Yet-Discriminative Artifact Feature Space Reconstruction
About
Fake Image Detection (FID), which aims at unified detection across four image forensic subdomains, is critical in real-world forensic scenarios. Compared with ensemble approaches, monolithic FID models are theoretically more promising, but to date they have consistently yielded inferior performance in practice. In this work, we identify the intrinsic distinctness of artifacts across subdomains, a critical barrier we term the "Ji-Zhe phenomenon". Driven by this phenomenon, we diagnose the cause of this underperformance for the first time: the collapse of the artifact feature space. The core challenge in developing a practical monolithic FID model thus boils down to the "unified-yet-discriminative" reconstruction of the artifact feature space. To address this paradoxical challenge, we hypothesize that high-level semantics can serve as a structural prior for the reconstruction, and we propose Semantic-Induced Constrained Adaptation (SICA), the first monolithic FID paradigm. Extensive experiments on our OpenMMSec dataset demonstrate that SICA outperforms 15 state-of-the-art methods and reconstructs the target unified-yet-discriminative artifact feature space in a near-orthogonal manner, validating our hypothesis. The code and dataset are available at: https://github.com/venus-guangjian/SICA_OpenMMSec.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Artifact Detection | OpenMMSec | Deepfake EFS | 95.9 | 68 |
| Deepfake Detection | FaceForensics++ c23 (test) | AUC | 95.5 | 52 |
| Image Forgery Detection | ForensicHub IFF-Protocol v2025 (test) | FF-c40 | 0.825 | 23 |
| Deepfake Detection | DeepfakeBench Cross Domain multiple (test) | AUC (CDFv1) | 90.67 | 18 |
| AIGC Detection | GenImage v1 (test) | Midjourney Accuracy | 65.5 | 8 |