Detecting Deepfake by Creating Spatio-Temporal Regularity Disruption

Authors: Jiazhi Guan, Hang Zhou, Mingming Gong, Errui Ding, Jingdong Wang, Youjian Zhao

Published: 2022-07-21 10:42:34+00:00

AI Summary

This paper proposes a novel deepfake detection method that focuses on disrupting the spatio-temporal regularity of real videos to create pseudo-fake videos for training. The method leverages a Pseudo-fake Generator and a Spatio-Temporal Enhancement block to learn these disruptions, improving generalization without using actual fake videos.

Abstract

Despite encouraging progress in deepfake detection, generalization to unseen forgery types remains a significant challenge due to the limited forgery clues explored during training. In contrast, we notice a common phenomenon in deepfake: fake video creation inevitably disrupts the statistical regularity in original videos. Inspired by this observation, we propose to boost the generalization of deepfake detection by distinguishing the regularity disruption that does not appear in real videos. Specifically, by carefully examining the spatial and temporal properties, we propose to disrupt a real video through a Pseudo-fake Generator and create a wide range of pseudo-fake videos for training. Such practice allows us to achieve deepfake detection without using fake videos and improves the generalization ability in a simple and efficient manner. To jointly capture the spatial and temporal disruptions, we propose a Spatio-Temporal Enhancement block to learn the regularity disruption across space and time on our self-created videos. Through comprehensive experiments, our method exhibits excellent performance on several datasets.


Key findings
The proposed method performs well on several datasets and generalizes better to unseen forgery types than existing methods, with a reported 10.67% increase in AUC on the Deepwild dataset. It achieves this without using actual deepfake videos for training.
Approach
The approach trains a deepfake detector using only real videos, modified by a Pseudo-fake Generator to introduce spatio-temporal irregularities mimicking those found in deepfakes. A Spatio-Temporal Enhancement block is used to capture these irregularities, improving the model's ability to generalize to unseen deepfake types.
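The idea of building training data from real videos alone can be illustrated with a minimal sketch. This is not the authors' Pseudo-fake Generator: the function name `make_pseudo_fake`, the rectangular region, the frame-borrowing step, and the `max_shift` and `blend` parameters are all assumptions chosen for illustration. It only shows one simple way a real clip could be perturbed inside a local region so that both its spatial and temporal regularity are disturbed.

```python
import numpy as np


def make_pseudo_fake(clip, rng, max_shift=2, blend=0.5):
    """Disrupt a real clip inside a random rectangle (illustrative only).

    clip: float array of shape (T, H, W, C) with values in [0, 1].
    Returns (pseudo_fake, mask), where mask is a boolean (H, W) array
    marking the disrupted region.
    """
    t, h, w, _ = clip.shape
    assert t > max_shift, "clip must be longer than max_shift frames"

    # Assumption: the disrupted region is a rectangle covering a quarter of the frame.
    y0 = int(rng.integers(0, h // 2))
    x0 = int(rng.integers(0, w // 2))
    y1, x1 = y0 + h // 2, x0 + w // 2
    mask = np.zeros((h, w), dtype=bool)
    mask[y0:y1, x0:x1] = True

    out = clip.copy()
    for i in range(t):
        # Temporal disruption: take the region from a nearby frame
        # (a later one when available, otherwise an earlier one).
        shift = int(rng.integers(1, max_shift + 1))
        j = i + shift if i + shift < t else i - shift
        # Spatial disruption: blend the borrowed region into the current
        # frame, which leaves a seam at the region boundary.
        out[i, y0:y1, x0:x1] = (
            blend * clip[i, y0:y1, x0:x1] + (1.0 - blend) * clip[j, y0:y1, x0:x1]
        )
    return out, mask


# Usage: random noise stands in for a real face clip.
rng = np.random.default_rng(0)
real_clip = rng.random((8, 16, 16, 3))
pseudo_fake, mask = make_pseudo_fake(real_clip, rng)

# A detector would then be trained with real clips labeled 0
# and pseudo-fake clips labeled 1.
clips = np.stack([real_clip, pseudo_fake])
labels = np.array([0, 1])
```

Pixels outside the mask are left untouched, so the only cue separating the two classes is the local disruption itself, which is the property the approach relies on.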
Datasets
UNKNOWN. The abstract mentions that the method shows excellent performance on several datasets, but doesn't specify which ones.
Model(s)
UNKNOWN. The architecture of the deepfake detector is not explicitly specified; only the Spatio-Temporal Enhancement block is described.
Author countries
UNKNOWN