UniGenDet: A Unified Generative-Discriminative Framework for Co-Evolutionary Image Generation and Generated Image Detection

About

In recent years, significant progress has been made in both image generation and generated image detection. Despite their rapid, yet largely independent, development, these two fields have evolved distinct architectural paradigms: the former predominantly relies on generative networks, while the latter favors discriminative frameworks. A recent trend in both domains is the use of adversarial information to enhance performance, revealing potential for synergy. However, the significant architectural divergence between them presents considerable challenges. Departing from previous approaches, we propose UniGenDet: a Unified generative-discriminative framework for co-evolutionary image Generation and generated image Detection. To bridge the task gap, we design a symbiotic multimodal self-attention mechanism and a unified fine-tuning algorithm. This synergy allows the generation task to improve the interpretability of authenticity identification, while authenticity criteria guide the creation of higher-fidelity images. Furthermore, we introduce a detector-informed generative alignment mechanism to facilitate seamless information exchange. Extensive experiments on multiple datasets demonstrate that our method achieves state-of-the-art performance. Code: https://github.com/Zhangyr2022/UniGenDet

Yanran Zhang, Wenzhao Zheng, Yifei Li, Bingyao Yu, Yu Zheng, Lei Chen, Jiwen Lu, Jie Zhou • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Text-to-Image Generation | GenEval | -- | 704 |
| Synthetic Image Detection | DMimage (Overall) | Accuracy 98.6 | 18 |
| Synthetic Image Detection and Artifact Explanation | FakeClue | Accuracy 98 | 16 |
| Synthetic Image Detection | DMimage (Real) | Accuracy 99 | 9 |
| Synthetic Image Detection | DMimage (Fake) | Accuracy 97.2 | 9 |
| Synthetic Image Detection | ARForensics | LlamaGen Accuracy 89.4 | 9 |
| Text-to-Image Generation | LAION 5,000 prompts | FID 17.5 | 3 |
| Text-to-Image Generation Diversity | LAION 500 prompts | CLIP Similarity 0.802 | 2 |
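
The DMimage rows report accuracy separately for real images, fake images, and overall. As a minimal sketch of how such a breakdown is commonly computed for a binary real/fake detector, the snippet below uses made-up labels and predictions; the function name `accuracy_breakdown` and the label convention (0 = real, 1 = fake) are assumptions for illustration, not taken from the paper or its code.

```python
from typing import Dict, Optional, Sequence


def accuracy_breakdown(labels: Sequence[int], preds: Sequence[int]) -> Dict[str, float]:
    """Return accuracy on real images, fake images, and all images.

    Assumed convention (hypothetical): 0 = real, 1 = fake.
    """
    if len(labels) != len(preds):
        raise ValueError("labels and preds must have the same length")

    def acc(cls: Optional[int] = None) -> float:
        # Keep only samples of the requested class (or all samples if cls is None).
        pairs = [(y, p) for y, p in zip(labels, preds) if cls is None or y == cls]
        if not pairs:
            return float("nan")
        return sum(y == p for y, p in pairs) / len(pairs)

    return {"real": acc(0), "fake": acc(1), "overall": acc()}


# Toy data for illustration only: four real images and two fake images.
labels = [0, 0, 0, 0, 1, 1]
preds = [0, 0, 0, 1, 1, 1]
result = accuracy_breakdown(labels, preds)
print(result)
```

Under this definition the overall accuracy is weighted by the number of samples in each class, so it need not equal the plain average of the real and fake accuracies.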
