HydraPrompt: An Adaptive and Asymmetric Framework of Vision-Language Models for Synthetic Image Detection

About

The rapid evolution of generative models has led to a proliferation of fabricated content, posing significant challenges to existing Synthetic Image Detection (SID) methods. Building on advances in vision-language models (e.g., CLIP), recent approaches use learnable textual prompts to identify synthetic images. However, they still rely on a static prompt as a fixed boundary between real and fake images, and so fail to adapt to the varying types of forgery that emerge during inference. To overcome this issue, we propose **HydraPrompt**, an asymmetric prompting framework that dynamically adjusts the category centers by aligning them with fine-grained image cues. Specifically, we propose an Asymmetric Prompt Adapter (**APA**): (1) for the authentic category, we introduce a single set of prompts that captures consistent, representative patterns and serves as a unified anchor for real content; (2) for the fake category, we construct sample-adaptive prompts that specialize in capturing diverse cues from different samples, enabling adaptive modeling of forgery variations. To improve discriminability among different synthetic images, we further introduce a Conditional Supervised Contrastive (**CSC**) objective, which compacts the authentic representations while capturing fine-grained forgery clues. Extensive experiments on popular SID benchmarks demonstrate the state-of-the-art performance of our framework.
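The asymmetry described in the abstract (one fixed center for real images, a per-sample center for fake images) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how APA builds its prompts, so the linear map `W`, the helper names `adapt_fake_center` and `classify`, and the temperature value are all hypothetical, and plain vectors stand in for CLIP image and text embeddings.

```python
import math

def normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def adapt_fake_center(base_fake, image_feat, W):
    """Hypothetical adapter: shift a shared 'fake' embedding by a linear
    function of the image feature, so each sample gets its own fake center.
    W is a (dim x dim) list of rows; in a real system it would be learned."""
    offset = [sum(w * f for w, f in zip(row, image_feat)) for row in W]
    return [b + o for b, o in zip(base_fake, offset)]

def classify(image_feat, real_anchor, base_fake, W, temperature=0.07):
    """Return [p_real, p_fake] from cosine similarities to the two centers.
    The real center is static; the fake center depends on the sample."""
    fake_center = adapt_fake_center(base_fake, image_feat, W)
    logits = [
        cosine(image_feat, real_anchor) / temperature,
        cosine(image_feat, fake_center) / temperature,
    ]
    m = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

# Toy 2-D example; all values are made up for illustration.
real_anchor = [1.0, 0.0]
base_fake = [0.0, 1.0]
W_zero = [[0.0, 0.0], [0.0, 0.0]]   # no adaptation: behaves like a static prompt
W_half = [[0.5, 0.0], [0.0, 0.5]]   # sample-dependent shift of the fake center

p_static = classify([1.0, 0.1], real_anchor, base_fake, W_zero)
p_adapt = classify([1.0, 0.1], real_anchor, base_fake, W_half)
```

With `W_zero` the fake center is identical for every image, which corresponds to the static-prompt baseline the abstract criticizes; with a non-zero `W` the fake center moves with each sample while the real anchor stays fixed.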

Senyuan Shi, Hao Tan, Zichang Tan, Shuhan Feng, Ajian Liu, Sergio Escalera, Jun Wan • 2026

Related benchmarks

Task	Dataset	Result
AI-generated image detection	Chameleon (test)	Accuracy 69.7	109
AI-generated image detection	WildRF Reddit (test)	Accuracy 95.3	19
AI-generated image detection	WildRF (Facebook) (test)	Accuracy 95.2	19
AI-generated image detection	WildRF Twitter (test)	Accuracy 97.3	19
Synthetic Image Detection	UniversalFakeDetect Guided 49 (test)	Accuracy 89.5	12
Synthetic Image Detection	UniversalFakeDetect LDM 200 steps 49 (test)	Accuracy 99.5	12
Synthetic Image Detection	UniversalFakeDetect LDM 100 steps 49 (test)	Accuracy 99.6	12
Synthetic Image Detection	UniversalFakeDetect Mean 49 (test)	Accuracy 95.9	12
Synthetic Image Detection	UniversalFakeDetect DALL-E 49 (test)	Accuracy 98.4	12
Synthetic Image Detection	UniversalFakeDetect LDM 200 w/cfg 49 (test)	Accuracy 97.3	12

