Fake face detection via adaptive manipulation traces extraction network

Authors: Zhiqing Guo, Gaobo Yang, Jiyou Chen, Xingming Sun

Published: 2020-05-11 09:16:39+00:00

AI Summary

This paper proposes AMTEN, an adaptive manipulation traces extraction network for pre-processing images before deepfake detection. AMTENnet, a deepfake detector integrating AMTEN with a CNN, achieves an average accuracy of up to 98.52% on various deepfake datasets, outperforming state-of-the-art methods, even under complex scenarios with post-processing operations.

Abstract

With the proliferation of face image manipulation (FIM) techniques such as Face2Face and Deepfake, more fake face images are spreading over the internet, which poses serious challenges to public confidence. Face image forgery detection has made considerable progress in exposing specific FIM techniques, but a robust fake face detector that can expose forgeries under complex scenarios, such as further compression, blurring, and scaling, is still lacking. Because of their relatively fixed structure, convolutional neural networks (CNNs) tend to learn image content representations, whereas image forensics tasks require learning subtle manipulation traces. Thus, we propose an adaptive manipulation traces extraction network (AMTEN), which serves as a pre-processing step to suppress image content and highlight manipulation traces. AMTEN exploits an adaptive convolution layer to predict manipulation traces in the image, which are reused in subsequent layers to maximize manipulation artifacts by updating weights during the back-propagation pass. A fake face detector, namely AMTENnet, is constructed by integrating AMTEN with a CNN. Experimental results show that the proposed AMTEN achieves the desired pre-processing effect. When detecting fake face images generated by various FIM techniques, AMTENnet achieves an average accuracy of up to 98.52%, which outperforms state-of-the-art works. When detecting face images with unknown post-processing operations, the detector also achieves an average accuracy of 95.17%.


Key findings
AMTENnet achieves an average accuracy of up to 98.52% when detecting fake face images generated by various FIM techniques, outperforming the state-of-the-art methods it is compared against. On face images with unknown post-processing operations (such as compression, blurring, and scaling), it still achieves an average accuracy of 95.17%. The adaptive nature of AMTEN proves superior to fixed predictors for extracting manipulation traces.
Approach
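The authors propose AMTEN, a pre-processing network whose adaptive convolution layer learns to predict manipulation traces by updating its weights during back-propagation, highlighting these traces while suppressing image content. The extracted traces are reused in subsequent layers and then fed into a CNN for fake face classification; the combined detector is called AMTENnet.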
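The idea of suppressing content and keeping traces can be illustrated with a small single-channel sketch. It assumes the trace is formed as the difference between a convolution output and its input, which is one plausible reading of the description above and not a confirmed detail of AMTEN. The helper names (`conv2d_same`, `extract_traces`) and the kernel values are hypothetical; in AMTEN the kernel weights would be learned during back-propagation, whereas here they are fixed by hand to keep the example self-contained.

```python
# Illustrative sketch only: a "predict, then subtract" residual step on one channel.
# Kernel values and function names are hypothetical, not taken from the paper.

def conv2d_same(image, kernel):
    """Zero-padded 2D cross-correlation with same-size output (odd square kernel)."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di][dj] * image[y][x]
            out[i][j] = acc
    return out


def extract_traces(image, kernel):
    """Return conv(image) - image, so smooth content cancels and local anomalies remain."""
    predicted = conv2d_same(image, kernel)
    return [
        [p - v for p, v in zip(pred_row, img_row)]
        for pred_row, img_row in zip(predicted, image)
    ]


# A flat 5x5 patch with one perturbed pixel standing in for a manipulation artifact.
image = [[10.0] * 5 for _ in range(5)]
image[2][2] = 14.0

# Hand-picked kernel: predict each pixel as the mean of its four neighbours.
kernel = [
    [0.0, 0.25, 0.0],
    [0.25, 0.0, 0.25],
    [0.0, 0.25, 0.0],
]

residual = extract_traces(image, kernel)
print(residual[1][1])  # flat interior region: 0.0
print(residual[2][2])  # perturbed pixel: -4.0
```

In the flat interior the prediction matches the pixel, so the residual is zero, while the perturbed pixel leaves a clear response. Border pixels are also non-zero here because of zero padding. A fixed kernel like this one corresponds to the "fixed predictor" style of pre-processing; an adaptive layer would instead adjust the kernel from training data.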
Datasets
HFF dataset (a hybrid fake face dataset including CelebA, CelebA-HQ, YouTube-Frame, PGGAN, StyleGAN, Glow, Face2Face, and StarGAN generated images), FaceForensics++ (FF++) dataset (including Deepfakes, Face2Face, FaceSwap, and NeuralTextures manipulated videos with high and low quality compression).
Model(s)
AMTENnet (a CNN integrating the proposed AMTEN module for adaptive manipulation trace extraction). The paper also compares with Meso-4, MesoInception-4, Hand-Crafted-Res, MISLnet, and XceptionNet.
Author countries
China