Beyond Semantic Features: Pixel-level Mapping for Generalized AI-Generated Image Detection
About
The rapid evolution of generative technologies necessitates reliable methods for detecting AI-generated images. A critical limitation of current detectors is their failure to generalize to images from unseen generative models, as they often overfit to source-specific semantic cues rather than learning universal generative artifacts. To overcome this, we introduce a simple yet remarkably effective pixel-level mapping pre-processing step to disrupt the pixel value distribution of images and break the fragile, non-essential semantic patterns that detectors commonly exploit as shortcuts. This forces the detector to focus on more fundamental and generalizable high-frequency traces inherent to the image generation process. Through comprehensive experiments on GAN and diffusion-based generators, we show that our approach significantly boosts the cross-generator performance of state-of-the-art detectors. Extensive analysis further verifies our hypothesis that the disruption of semantic cues is the key to generalization.
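The abstract does not specify the form of the pixel-level mapping, so the following is only a minimal sketch of one plausible reading: a fixed, seeded bijective lookup table over the 256 intensity levels, applied identically to every pixel and channel before the image reaches the detector. The function names `make_pixel_mapping` and `apply_pixel_mapping`, the use of a random permutation, and the `seed` parameter are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def make_pixel_mapping(seed: int = 0) -> np.ndarray:
    """Build a fixed bijective lookup table over the 256 intensity levels.

    Assumption: a seeded random permutation stands in for the paper's
    mapping, whose exact form is not given in the abstract.
    """
    rng = np.random.default_rng(seed)
    return rng.permutation(256).astype(np.uint8)


def apply_pixel_mapping(image: np.ndarray, lut: np.ndarray) -> np.ndarray:
    """Remap every pixel value of a uint8 image through the lookup table."""
    if image.dtype != np.uint8:
        raise ValueError("expected a uint8 image")
    # Fancy indexing applies the table element-wise and keeps the image shape.
    return lut[image]


# Toy usage on a random 4x4 RGB image (hypothetical data).
lut = make_pixel_mapping(seed=0)
image = np.random.default_rng(1).integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
mapped = apply_pixel_mapping(image, lut)
```

Under this assumption the same table would be applied to both training and test images, so the detector always sees the remapped pixel distribution. Because the table is a permutation, no information is discarded; only the ordering of intensity values is scrambled.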
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Generated Image Detection | GenImage (test) | Average Accuracy | 98.4 | 103 |
| AI-generated image detection | GenImage | Midjourney Detection Rate | 91.5 | 65 |
| Fake Image Detection | UniversalFakeDetect | Guided Score | 98.5 | 13 |
| GAN Image Detection | Self-Synthesis 9 GANs | AttGAN Score | 99.8 | 12 |
| Synthetic Image Detection | Self-Synthesis 9 GANs (test) | AttGAN Accuracy | 99.6 | 12 |