
MARE: Multimodal Alignment and Reinforcement for Explainable Deepfake Detection via Vision-Language Models

About

Deepfake detection is a widely researched topic that is crucial for combating the spread of malicious content, with existing methods mainly modeling the problem as classification or spatial localization. The rapid advancement of generative models imposes new demands on Deepfake detection. In this paper, we propose multimodal alignment and reinforcement for explainable Deepfake detection via vision-language models, termed MARE, which aims to enhance the accuracy and reliability of Vision-Language Models (VLMs) in Deepfake detection and reasoning. Specifically, MARE designs comprehensive reward functions, incorporating reinforcement learning from human feedback (RLHF), to incentivize the generation of text-spatially aligned reasoning content that adheres to human preferences. In addition, MARE introduces a forgery disentanglement module to capture intrinsic forgery traces from high-level facial semantics, thereby improving its authenticity detection capability. We conduct thorough evaluations of the reasoning content generated by MARE. Both quantitative and qualitative experimental results demonstrate that MARE achieves state-of-the-art performance in terms of accuracy and reliability.
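The abstract describes reward functions that jointly score the correctness of the verdict and the spatial grounding of the explanation. The paper's actual formulation is not given here, so the following is only an illustrative sketch: a composite reward that combines a classification term with an IoU-based localization term. All function names, weights, and the box-based localization assumption are hypothetical.

```python
# Illustrative sketch of a text-spatially aligned reward, NOT the
# paper's actual reward functions. Assumes forgery regions are given
# as (x1, y1, x2, y2) boxes; weights w_cls / w_loc are arbitrary.

def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def composite_reward(pred_label, true_label, pred_box, true_box,
                     w_cls=0.6, w_loc=0.4):
    """Weighted sum of a verdict reward and a localization reward.

    pred_label / true_label: "real" or "fake".
    pred_box / true_box: the region the reasoning text points at;
    localization is only scored when the sample is actually fake.
    """
    r_cls = 1.0 if pred_label == true_label else 0.0
    r_loc = iou(pred_box, true_box) if true_label == "fake" else 1.0
    return w_cls * r_cls + w_loc * r_loc
```

Under this sketch, a correct verdict with a perfectly grounded region scores 1.0, while a correct verdict pointing at the wrong region is penalized in proportion to the overlap, which is the kind of incentive the text-spatial alignment objective aims at.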

Wenbo Xu, Wei Lu, Xiangyang Luo, Jiantao Zhou • 2026

Related benchmarks

Task                 Dataset                Metric    Result   Rank
Deepfake Detection   DFDC (test)            AUC       99.77    87
Deepfake Detection   FF++ (test)            AUC       99.28    39
Deepfake Detection   Celeb-DF (test)        Accuracy  100      24
Deepfake Detection   WildDeepfake (test)    AUC       0.937    19
Deepfake Detection   FaceForensics++ (test) OA        96.55    13
Deepfake Detection   DMA dataset (test)     Accuracy  0.9809   7
Deepfake Detection   WDF (test)             Accuracy  87.72    5
Deepfake Detection   DFD (test)             Accuracy  98.25    5
