Deepfake Detection of Occluded Images Using a Patch-based Approach

Authors: Mahsa Soleimani, Ali Nazari, Mohsen Ebrahimi Moghaddam

Published: 2023-04-10 12:12:14+00:00

AI Summary

This paper proposes a deep learning approach for deepfake detection in occluded images, built on a three-path decision mechanism. The approach combines whole-face analysis with patch-based analysis (concatenation and majority voting of patch features) to improve robustness against occlusions, improving on state-of-the-art results by 0.4%-7.9% across the authors' datasets.

Abstract

DeepFake involves the use of deep learning and artificial intelligence techniques, typically GANs, to produce or alter video and image content. It can be misused, leading to fake news and ethical and financial crimes, and it also degrades the performance of facial recognition systems. Detecting whether an image is real or fake is therefore important, especially for authenticating people's images or videos. One of the most important challenges in this area is obstruction, which decreases system precision. In this study, we present a deep learning approach that uses the entire face and face patches to distinguish real from fake images in the presence of obstruction, with a three-path decision: first, entire-face reasoning; second, a decision based on the concatenation of the feature vectors of face patches; and third, a majority-vote decision based on these features. To test our approach, new datasets containing real and fake images are created. To produce fake images, StyleGAN and StyleGAN2 are trained on FFHQ images, and StarGAN and PGGAN are trained on CelebA images. The CelebA and FFHQ datasets are used as real images. The proposed approach reaches higher results in early epochs than other methods and improves on SoTA results by 0.4%-7.9% on the different built datasets. We also show experimentally that weighting the patches may improve accuracy.


Key findings
The proposed method outperforms existing methods, improving on state-of-the-art results by 0.4%-7.9% across the constructed datasets. Occlusion removal significantly improves accuracy, particularly in datasets with high occlusion rates. Patch weighting may further improve accuracy.
Approach
The method uses a three-path decision system: analyzing the whole face, concatenating feature vectors from facial patches, and performing a majority vote on patch classifications. It employs a Gram-Net network for feature extraction and addresses occlusions through detection and removal of occluded areas.
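The three-path decision described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the REAL/FAKE label encoding, the tie-breaking rule, and in particular the way the three path outputs are fused into one label (a simple two-out-of-three vote here) are all assumptions, since the summary does not specify them. The per-path labels are taken as given inputs; in the paper they would come from classifiers built on Gram-Net features.

```python
from typing import Optional, Sequence

# Hypothetical label encoding, chosen for this sketch only.
REAL, FAKE = 0, 1


def majority_vote(patch_labels: Sequence[int],
                  weights: Optional[Sequence[float]] = None) -> int:
    """Vote over per-patch labels, optionally weighting each patch.

    Ties are resolved in favor of FAKE (an assumption of this sketch).
    """
    if not patch_labels:
        raise ValueError("at least one patch label is required")
    if weights is None:
        weights = [1.0] * len(patch_labels)
    if len(weights) != len(patch_labels):
        raise ValueError("weights and patch_labels must have equal length")

    fake_score = sum(w for lbl, w in zip(patch_labels, weights) if lbl == FAKE)
    real_score = sum(w for lbl, w in zip(patch_labels, weights) if lbl == REAL)
    return FAKE if fake_score >= real_score else REAL


def three_path_decision(whole_face_label: int,
                        concat_label: int,
                        patch_labels: Sequence[int],
                        weights: Optional[Sequence[float]] = None) -> int:
    """Fuse the three paths into one label.

    Path 1: label from the whole face.
    Path 2: label from the concatenated patch feature vectors.
    Path 3: majority vote over per-patch labels.
    The two-out-of-three fusion below is an assumption, not taken from the paper.
    """
    vote_label = majority_vote(patch_labels, weights)
    paths = [whole_face_label, concat_label, vote_label]
    # Labels are 0/1, so the sum counts how many paths said FAKE.
    return FAKE if sum(paths) >= 2 else REAL


if __name__ == "__main__":
    # The whole face looks real, but the patch-based paths disagree.
    label = three_path_decision(REAL, FAKE, [FAKE, REAL, FAKE, FAKE])
    print("fake" if label == FAKE else "real")
```

Passing `weights` to `majority_vote` shows one simple way patch weighting could enter the decision, for example by giving less weight to patches that overlap an occluded region. The weighting scheme the authors actually evaluated is not described in this summary.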
Datasets
Real images: CelebA and FFHQ. Fake images: generated with StyleGAN and StyleGAN2 (trained on FFHQ) and with StarGAN and PGGAN (trained on CelebA).
Model(s)
Gram-Net (feature extraction), SGD optimizer, cross-entropy loss.
Author countries
Iran