Learning Self-Consistency for Deepfake Detection

Authors: Tianchen Zhao, Xiang Xu, Mingze Xu, Hui Ding, Yuanjun Xiong, Wei Xia

Published: 2020-12-16 23:06:56+00:00

AI Summary

This paper presents Pair-wise Self-Consistency Learning (PCL), a deepfake detection method that exploits inconsistencies among the source features within a forged image. PCL trains a convolutional neural network to extract these features using a pair-wise consistency loss, with training data supplied by an inconsistency image generator (I2G). Across seven datasets, it raises averaged AUC over the prior state of the art from 96.45% to 98.05% in in-dataset evaluation and from 86.03% to 92.18% in cross-dataset evaluation.

Abstract

We propose a new method to detect deepfake images using the cue of the source feature inconsistency within the forged images. It is based on the hypothesis that images' distinct source features can be preserved and extracted after going through state-of-the-art deepfake generation processes. We introduce a novel representation learning approach, called pair-wise self-consistency learning (PCL), for training ConvNets to extract these source features and detect deepfake images. It is accompanied by a new image synthesis approach, called inconsistency image generator (I2G), to provide richly annotated training data for PCL. Experimental results on seven popular datasets show that our models improve averaged AUC over the state of the art from 96.45% to 98.05% in the in-dataset evaluation and from 86.03% to 92.18% in the cross-dataset evaluation.

Key findings

PCL raises averaged AUC over the previous state of the art by 1.60 percentage points in in-dataset evaluation (96.45% to 98.05%), with near-perfect results on some datasets, and by 6.15 percentage points in cross-dataset evaluation (86.03% to 92.18%). The larger cross-dataset gain suggests that the method generalizes across datasets and forgery techniques.

Approach

A convolutional neural network extracts source features from an image. During training, a pair-wise self-consistency loss teaches the network to recognize when features within the same image disagree, which signals a forgery. An inconsistency image generator (I2G) synthesizes the annotated training data.
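The idea can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`blend`, `patch_means`, `pairwise_consistency_loss`), the scaled dot-product similarity, and the target `1 - |m_i - m_j|` are assumptions chosen for illustration. A blending mask of the kind an I2G-style generator could provide is reduced to one value per patch, and a binary cross-entropy loss compares the predicted consistency of every patch pair with the consistency implied by the mask.

```python
import math

def blend(source, target, mask):
    """Alpha-blend two equally sized grayscale images (lists of rows).
    mask value 1 takes the pixel from `source`, 0 from `target`.
    Hypothetical stand-in for a generator that returns image plus mask."""
    return [[m * s + (1.0 - m) * t for s, t, m in zip(sr, tr, mr)]
            for sr, tr, mr in zip(source, target, mask)]

def patch_means(mask, patch):
    """Average the mask over non-overlapping patch x patch cells (row-major)."""
    out = []
    for r in range(0, len(mask), patch):
        for c in range(0, len(mask[0]), patch):
            cell = [mask[i][j]
                    for i in range(r, r + patch)
                    for j in range(c, c + patch)]
            out.append(sum(cell) / len(cell))
    return out

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def pairwise_consistency_loss(features, patch_mask, eps=1e-7):
    """Mean binary cross-entropy over all ordered patch pairs.
    features:   one feature vector per patch (e.g. from a ConvNet).
    patch_mask: manipulated fraction per patch, in [0, 1].
    Assumed target: 1 - |m_i - m_j| (1 means same source)."""
    n = len(features)
    scale = math.sqrt(len(features[0]))
    total = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(features[i], features[j]))
            pred = sigmoid(dot / scale)            # predicted consistency
            pred = min(max(pred, eps), 1.0 - eps)  # avoid log(0)
            tgt = 1.0 - abs(patch_mask[i] - patch_mask[j])
            total -= tgt * math.log(pred) + (1.0 - tgt) * math.log(1.0 - pred)
    return total / (n * n)

# Toy example: four patches, the last two come from a second source.
feats = [[4.0, -4.0], [4.0, -4.0], [-4.0, 4.0], [-4.0, 4.0]]
loss_matching = pairwise_consistency_loss(feats, [0.0, 0.0, 1.0, 1.0])
loss_mismatched = pairwise_consistency_loss(feats, [0.0, 1.0, 0.0, 1.0])
```

In this toy example the loss is low when the features group patches the same way the mask does, and high when they do not. At test time, under the same assumptions, low predicted consistency between patches of one image would be the cue for a forgery.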

Datasets

FF++, CD2, DFDC-P, DFD, DFR, CD1, DFDC

Model(s)

ResNet-34 (with modifications for PCL)

Author countries

USA, China
