Fake It till You Make It: Curricular Dynamic Forgery Augmentations towards General Deepfake Detection

Authors: Yuzhen Lin, Wentang Song, Bin Li, Yuezun Li, Jiangqun Ni, Han Chen, Qiushi Li

Published: 2024-09-22 13:51:22+00:00

AI Summary

This paper introduces CDFA, a novel deepfake detection method that jointly trains a detector with a forgery augmentation policy network. CDFA progressively applies forgery augmentations following a monotonic curriculum and dynamically selects augmentations for optimal generalization, significantly improving cross-dataset and cross-manipulation performance.

Abstract

Previous studies in deepfake detection have shown promising results when testing face forgeries from the same dataset as the training. However, the problem remains challenging when one tries to generalize the detector to forgeries from unseen datasets and created by unseen methods. In this work, we present a novel general deepfake detection method, called Curricular Dynamic Forgery Augmentation (CDFA), which jointly trains a deepfake detector with a forgery augmentation policy network. Unlike the previous works, we propose to progressively apply forgery augmentations following a monotonic curriculum during the training. We further propose a dynamic forgery searching strategy to select one suitable forgery augmentation operation for each image varying between training stages, producing a forgery augmentation policy optimized for better generalization. In addition, we propose a novel forgery augmentation named self-shifted blending image to simply imitate the temporal inconsistency of deepfake generation. Comprehensive experiments show that CDFA can significantly improve both cross-datasets and cross-manipulations performances of various naive deepfake detectors in a plug-and-play way, and make them attain superior performances over the existing methods in several benchmark datasets.

Key findings

CDFA improves cross-dataset and cross-manipulation deepfake detection performance over the existing methods it is compared against. The proposed Self-Shifted Blending Image (SSBI) augmentation and the dynamic forgery search strategy are the key contributors to this improvement. The gains carry over to several backbone models, so CDFA can be attached to an existing detector in a plug-and-play way.
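This summary names SSBI but does not say how it is computed. As a rough illustration only, the sketch below assumes the general idea is to blend an image with a slightly shifted copy of itself inside a mask, so that the blended region is misaligned with its surroundings. The function names, the horizontal-only shift, the edge clamping, and the list-of-lists image format are all assumptions made for this example, not the authors' implementation.

```python
def shift_rows(image, dx):
    """Shift every row horizontally by dx pixels, repeating the edge pixel.

    `image` is a list of rows of float intensities (a hypothetical,
    dependency-free stand-in for a real image array).
    """
    width = len(image[0])
    return [
        [row[min(max(x - dx, 0), width - 1)] for x in range(width)]
        for row in image
    ]


def self_shifted_blend(image, mask, dx):
    """Blend `image` with its own shifted copy.

    `mask` holds weights in [0, 1]: 1 takes the shifted pixel,
    0 keeps the original pixel. This blending rule is an assumption.
    """
    shifted = shift_rows(image, dx)
    return [
        [m * s + (1.0 - m) * o for o, s, m in zip(orig_row, shift_row, mask_row)]
        for orig_row, shift_row, mask_row in zip(image, shifted, mask)
    ]


# Tiny one-row example: only the two middle pixels come from the shifted copy.
image = [[0.0, 1.0, 2.0, 3.0]]
mask = [[0.0, 1.0, 1.0, 0.0]]
blended = self_shifted_blend(image, mask, dx=1)
```

With an all-zero mask the function returns the original pixel values, which makes the augmentation easy to switch off for a given sample.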

Approach

CDFA uses a curriculum learning approach, progressively increasing the proportion of pseudo-fake samples (generated by a dynamic forgery augmentation policy network) during training. This policy network learns to select the best forgery augmentation operation for each image at each training stage, improving generalization.
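The two ideas in this paragraph can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the linear ramp, the stage count, the operation names other than SSBI, and the score dictionary standing in for the policy network's output are all hypothetical, since this summary does not give the actual schedule or the operation list.

```python
import random

# Hypothetical operation names; only SSBI is named in this summary.
AUGMENTATIONS = ["ssbi", "op_a", "op_b"]


def pseudo_fake_ratio(stage, num_stages, start=0.0, end=1.0):
    """Monotonic curriculum: share of pseudo-fake samples at a training stage.

    A linear ramp from `start` to `end` is assumed here for simplicity.
    """
    if num_stages <= 1:
        return end
    clamped = min(max(stage, 0), num_stages - 1)
    return start + (end - start) * clamped / (num_stages - 1)


def select_augmentation(scores):
    """Pick one operation for an image: the highest-scoring one.

    `scores` maps operation name to a score and stands in for the
    per-image output of a policy network (an assumption).
    """
    return max(scores, key=scores.get)


def plan_fake_slots(num_slots, stage, num_stages, rng):
    """Decide, per fake slot in a batch, whether to use a synthesized
    pseudo-fake or a fake taken from the training dataset."""
    ratio = pseudo_fake_ratio(stage, num_stages)
    return ["pseudo" if rng.random() < ratio else "dataset" for _ in range(num_slots)]


rng = random.Random(0)
schedule = [pseudo_fake_ratio(s, 5) for s in range(5)]
# schedule == [0.0, 0.25, 0.5, 0.75, 1.0]
chosen = select_augmentation({"ssbi": 0.7, "op_a": 0.2, "op_b": 0.1})
# chosen == "ssbi"
```

Because the ratio never decreases from one stage to the next, early training sees mostly dataset fakes and later training sees mostly synthesized ones. A real policy network would produce different scores for each image and stage, so the chosen operation would vary over training.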

Datasets

FaceForensics++ (FF++), Celeb-DF-v2 (CDF), DeepFake Detection Challenge preview (DFDCP), DeepFake Detection Challenge public (DFDC), WildDeepfake (Wild)

Model(s)

Swin Transformer V2-Base (Swin), Xception (Xcep), EfficientNet-B4 (ENb4)

Author countries

China
