Spatiotemporal Inconsistency Learning for DeepFake Video Detection

About

The rapid development of facial manipulation techniques has aroused public concern in recent years. Following the success of deep learning, existing methods typically formulate DeepFake video detection as a binary classification problem and develop frame-based and video-based solutions. However, little attention has been paid to capturing the spatial-temporal inconsistency in forged videos. To address this issue, we frame the task as a Spatial-Temporal Inconsistency Learning (STIL) process and instantiate it as a novel STIL block, which consists of a Spatial Inconsistency Module (SIM), a Temporal Inconsistency Module (TIM), and an Information Supplement Module (ISM). Specifically, we present a novel temporal modeling paradigm in TIM that exploits the temporal difference over adjacent frames along both the horizontal and vertical directions. The ISM simultaneously uses the spatial information from SIM and the temporal information from TIM to establish a more comprehensive spatial-temporal representation. Moreover, our STIL block is flexible and can be plugged into existing 2D CNNs. Extensive experiments and visualizations demonstrate the effectiveness of our method against state-of-the-art competitors.
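The idea of differencing adjacent frames and then looking at the result along each spatial direction can be illustrated with a toy sketch. This is not the STIL block or the authors' implementation: the function names `temporal_difference` and `directional_profiles` are hypothetical, the input is raw pixel values rather than learned CNN features, and the per-row and per-column averaging is an assumption chosen only to make the two directions visible.

```python
def temporal_difference(frames):
    """Difference between each pair of adjacent frames.

    frames: list of T frames, each an H x W list of numbers.
    Returns T-1 difference maps, each of shape H x W.
    """
    return [
        [[b - a for a, b in zip(row_prev, row_next)]
         for row_prev, row_next in zip(prev, nxt)]
        for prev, nxt in zip(frames, frames[1:])
    ]


def directional_profiles(diff):
    """Summarize one H x W difference map along each spatial direction.

    Horizontal profile: mean absolute change per row (length H).
    Vertical profile: mean absolute change per column (length W).
    This pooling is an illustrative assumption, not the paper's TIM.
    """
    h, w = len(diff), len(diff[0])
    horizontal = [sum(abs(v) for v in row) / w for row in diff]
    vertical = [sum(abs(diff[r][c]) for r in range(h)) / h for c in range(w)]
    return horizontal, vertical


# Toy clip: 3 frames of 2 x 2 "pixels"; only the top-left pixel flickers.
clip = [
    [[0, 0], [0, 0]],
    [[4, 0], [0, 0]],
    [[0, 0], [0, 0]],
]
diffs = temporal_difference(clip)
profiles = [directional_profiles(d) for d in diffs]
```

In this toy clip the flicker shows up only in the first row and first column of each profile, which is the kind of localized frame-to-frame change a temporal inconsistency module would aim to pick up.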

Zhihao Gu, Yang Chen, Taiping Yao, Shouhong Ding, Jilin Li, Feiyue Huang, Lizhuang Ma • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| AI-generated Video Detection | EA-Video seen (evaluation) | Accuracy 89.5 | 88 |
| Deepfake Detection | CelebDF (CDF) v2 (test) | AUC 75.6 | 52 |
| AI-generated Video Detection | EvalCrafter | Floor33 Score 90.42 | 42 |
| AI-generated Video Detection | VideoPhy 1.0 (test) | CVX Score 71.73 | 42 |
| AI-generated Video Detection | VidProm | AUC (MS) 43.68 | 42 |
| Synthetic Video Detection | GenVideo (test) | Average Detection Rate 74.37 | 34 |
| Video Forgery Detection | GenVideo (test) | Recall (Average) 73.83 | 31 |
| AI-generated Video Detection | EvalCrafter 14 subsets (test) | Floor33 Score 90.42 | 28 |
| AI-generated Video Detection | VideoPhy | CVX AUC 71.73 | 28 |
| AI-generated Video Detection | EA-Video (test) | Accuracy 74.8 | 24 |
Showing 10 of 39 rows
