Detecting Deepfake-Forged Contents with Separable Convolutional Neural Network and Image Segmentation

Authors: Chia-Mu Yu, Ching-Tang Chang, Yen-Wu Ti

Published: 2019-12-21 08:32:27+00:00

AI Summary

This paper proposes a facial forgery detection method using a separable convolutional neural network (CNN) and image segmentation. The approach segments images into patches, analyzes them individually with the CNN, and uses an ensemble model for improved detection capabilities.

Abstract

Recent advances in AI technology have made digital images and videos easier to forge and such forgeries significantly harder to identify. If disseminated with malicious intent, these forgeries can undermine social and political stability and pose significant ethical and legal challenges. Deepfake is a variant of the auto-encoder that uses deep learning techniques to identify and swap a person's face in a picture or film. It can erode public trust in digital images and videos, with far-reaching effects on political and social stability. This study therefore proposes a facial forgery detection solution that determines whether a picture or film has been processed by Deepfake. The proposed solution achieves efficient detection by using the recently proposed separable convolutional neural network (CNN) together with image segmentation. The study also examines how different image segmentation methods affect detection results. Finally, an ensemble model is used to improve detection capability. Experimental results demonstrate the excellent performance of the proposed solution.


Key findings
The proposed method achieved higher accuracy and AUC than MesoNet and Capsule-Forensics, particularly on datasets with different feature distributions. Image segmentation significantly improved the model's performance, with five segmentations yielding the best results.
Approach
The method segments facial images into patches and uses a separable CNN to extract features from each patch. An ensemble model combines these features to classify the image as real or fake using a voting mechanism.
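The patch-and-vote pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the horizontal-band split, the hard majority vote, and every function name are assumptions, and `toy_patch_classifier` is a hypothetical stand-in for the trained separable CNN.

```python
from collections import Counter
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel values in [0, 1]


def split_into_patches(image: Image, n_patches: int) -> List[Image]:
    """Split an image into n_patches horizontal bands of near-equal height.

    Assumption: the source does not say how patches are laid out, so
    horizontal bands are used here purely for illustration.
    """
    height = len(image)
    if not 1 <= n_patches <= height:
        raise ValueError("n_patches must be between 1 and the image height")
    bounds = [i * height // n_patches for i in range(n_patches + 1)]
    return [image[bounds[i]:bounds[i + 1]] for i in range(n_patches)]


def majority_vote(labels: List[str]) -> str:
    """Return the most frequent label (assumed hard-voting ensemble)."""
    return Counter(labels).most_common(1)[0][0]


def toy_patch_classifier(patch: Image) -> str:
    """Hypothetical placeholder for the per-patch separable CNN.

    It thresholds mean intensity only so that the sketch runs end to end.
    """
    pixels = [p for row in patch for p in row]
    return "fake" if sum(pixels) / len(pixels) > 0.5 else "real"


def classify_image(
    image: Image,
    classifier: Callable[[Image], str],
    n_patches: int = 5,
) -> str:
    """Classify each patch independently, then combine the votes."""
    votes = [classifier(patch) for patch in split_into_patches(image, n_patches)]
    return majority_vote(votes)


if __name__ == "__main__":
    # Synthetic 10x4 image: six bright rows followed by four dark rows.
    image = [[0.9] * 4 for _ in range(6)] + [[0.1] * 4 for _ in range(4)]
    print(classify_image(image, toy_patch_classifier))
```

Using an odd number of patches, such as the five reported as optimal, means a two-class vote cannot tie.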
Datasets
FaceForensics++, DeepFaceLab, StyleGAN
Model(s)
Separable Convolutional Neural Network (CNN), MesoNet, Capsule-Forensics
Author countries
Taiwan