DF40: Toward Next-Generation Deepfake Detection

About

We propose a new comprehensive benchmark intended to move the deepfake detection field toward its next generation. Existing works predominantly identify top-performing detection algorithms and models by following a common practice: training detectors on one specific dataset (e.g., FF++) and testing them on other prevalent deepfake datasets. This protocol is often regarded as a "golden compass" for navigating SoTA detectors. But can these stand-out "winners" truly be applied to the myriad of realistic and diverse deepfakes lurking in the real world? If not, what underlying factors contribute to this gap? In this work, we find that the dataset (both train and test) can be the "primary culprit" for three reasons: (1) forgery diversity: deepfake techniques commonly cover both face forgery and entire image synthesis, yet most existing datasets contain only some of these types, with a limited number of forgery methods implemented; (2) forgery realism: the dominant training dataset, FF++, contains outdated forgery techniques from the past four years, and "honing skills" on these forgeries makes it difficult to guarantee that detection generalizes to today's SoTA deepfakes; (3) evaluation protocol: most detection works evaluate on only one type of forgery, which hinders the development of universal deepfake detectors. To address this dilemma, we construct a highly diverse deepfake detection dataset called DF40, which comprises 40 distinct deepfake techniques. We then conduct comprehensive evaluations using 4 standard evaluation protocols and 8 representative detection methods, resulting in over 2,000 evaluations. Through these evaluations, we provide an extensive analysis from various perspectives, leading to 7 new insightful findings. We also open up 4 valuable yet previously underexplored research questions to inspire future work. Our project page is https://github.com/YZY-stack/DF40.
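
The cross-dataset practice described above (train on one dataset, then score the detector on several others and report AUC) can be illustrated with a minimal sketch. This is not the DF40 evaluation code: the function names `roc_auc` and `cross_dataset_auc`, the toy detector, and the toy test sets are all hypothetical, and AUC is computed directly from its pairwise definition rather than with any particular library.

```python
from statistics import mean


def roc_auc(labels, scores):
    """ROC-AUC from its pairwise definition: the probability that a randomly
    chosen fake (label 1) scores higher than a randomly chosen real (label 0),
    with ties counted as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one real and one fake sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_dataset_auc(detector, test_sets):
    """Score one already-trained detector on several held-out test sets.

    `detector` is any callable mapping a sample to a fakeness score
    (hypothetical interface). `test_sets` maps a dataset name to a
    (samples, labels) pair. Returns per-dataset AUCs and their mean."""
    per_dataset = {
        name: roc_auc(labels, [detector(x) for x in samples])
        for name, (samples, labels) in test_sets.items()
    }
    return per_dataset, mean(per_dataset.values())


if __name__ == "__main__":
    # Toy stand-in: each "sample" is already a number and the detector
    # returns it unchanged. Real samples would be images or video frames.
    toy_detector = lambda x: x
    toy_sets = {
        "set_A": ([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]),
        "set_B": ([0.2, 0.9], [0, 1]),
    }
    per_dataset, average = cross_dataset_auc(toy_detector, toy_sets)
    print(per_dataset, average)
```

A single averaged AUC of this kind hides which forgery types a detector fails on, which is one reason the choice of test sets matters as much as the detector.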

Zhiyuan Yan, Taiping Yao, Shen Chen, Yandan Zhao, Xinghe Fu, Junwei Zhu, Donghao Luo, Chengjie Wang, Shouhong Ding, Yunsheng Wu, Li Yuan • 2024

Related benchmarks

Task	Dataset	Result
Deepfake Detection	DFDC	AUC 96	230
Deepfake Detection	Celeb-DF	ROC-AUC 0.975	48
Deepfake Detection	WildDeepfake	AUC 0.853	25
Deepfake Detection	UADFV	Accuracy 72	24
Deepfake Detection	FaceForensics++ c40 (test)	AUC 95	24
Deepfake Detection	Protocol-1 Cross-Dataset (test)	Average AUC 92.7	18
Deepfake Detection	CIFAKE	Accuracy 47	7
Deepfake Video Detection	FaceForensics++ c23	AUC-ROC 99	4
