Unmasking DeepFakes with simple Features

Authors: Ricard Durall, Margret Keuper, Franz-Josef Pfreundt, Janis Keuper

Published: 2019-11-02 09:42:25+00:00

AI Summary

This paper proposes a novel deepfake detection method based on classical frequency domain analysis using Discrete Fourier Transform and azimuthal averaging, followed by a simple classifier (SVM, Logistic Regression, or K-Means). The approach achieves high accuracy with limited labeled data, even performing well in unsupervised scenarios.

Abstract

Deep generative models have recently achieved impressive results for many real-world applications, successfully generating high-resolution and diverse samples from complex datasets. Due to this improvement, fake digital content has proliferated, raising growing concern and spreading distrust in image content, and leading to an urgent need for automated ways to detect these AI-generated fake images. Despite the fact that many face editing algorithms seem to produce realistic human faces, upon closer examination they do exhibit artifacts in certain domains which are often hidden to the naked eye. In this work, we present a simple way to detect such fake face images - so-called DeepFakes. Our method is based on a classical frequency domain analysis followed by a basic classifier. Compared to previous systems, which need to be fed with large amounts of labeled data, our approach shows very good results using only a few annotated training samples and even achieves good accuracies in fully unsupervised scenarios. For the evaluation on high-resolution face images, we combined several public datasets of real and fake faces into a new benchmark: Faces-HQ. Given such high-resolution images, our approach reaches a perfect classification accuracy of 100% when trained on as few as 20 annotated samples. In a second experiment, on the medium-resolution images of the CelebA dataset, our method achieves 100% accuracy in the supervised setting and 96% in an unsupervised setting. Finally, evaluating low-resolution video sequences of the FaceForensics++ dataset, our method achieves 91% accuracy in detecting manipulated videos. Source Code: https://github.com/cc-hpc-itwm/DeepFakeDetection


Key findings
The proposed method achieved 100% accuracy on high-resolution images (Faces-HQ) when trained on as few as 20 labeled samples, and 100% supervised / 96% unsupervised accuracy on the medium-resolution CelebA images. On low-resolution videos (FaceForensics++), it achieved 91% accuracy in detecting manipulated videos. Performance is robust across datasets and remains strong even in unsupervised settings.
Approach
The method extracts features by applying a Discrete Fourier Transform (DFT) to each image and azimuthally averaging the resulting 2D power spectrum into a compact 1D representation. A simple classifier (SVM or Logistic Regression in the supervised setting, K-Means in the unsupervised one) is then trained on these features to distinguish real from fake images, as sketched below.
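A minimal NumPy-only sketch of this feature-extraction step is shown below. The function and variable names (azimuthal_average, extract_features) are illustrative, not taken from the authors' repository; see the linked source code for the reference implementation.

```python
import numpy as np

def azimuthal_average(spectrum: np.ndarray) -> np.ndarray:
    """Radially average a 2D power spectrum into a 1D profile."""
    h, w = spectrum.shape
    cy, cx = h // 2, w // 2
    y, x = np.indices((h, w))
    r = np.hypot(x - cx, y - cy).astype(int)  # integer radius per pixel
    # Mean power per radius: per-bin sum of values / per-bin pixel count.
    total = np.bincount(r.ravel(), weights=spectrum.ravel())
    count = np.bincount(r.ravel())
    return total / np.maximum(count, 1)

def extract_features(gray_image: np.ndarray) -> np.ndarray:
    """2D DFT -> centered log power spectrum -> 1D azimuthal average."""
    f = np.fft.fftshift(np.fft.fft2(gray_image))
    power = np.log(np.abs(f) ** 2 + 1e-8)  # log scale stabilizes magnitudes
    return azimuthal_average(power)
```

The 1D profile captures how spectral power decays with frequency; the paper's key observation is that generated faces show characteristic deviations in the high-frequency tail of this curve.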
Datasets
Faces-HQ (a new benchmark dataset created by combining CelebA-HQ, Flickr-Faces-HQ, 100K Faces project, and images from thispersondoesnotexist.com), CelebA, FaceForensics++ (specifically the DeepFakeDetection dataset)
Model(s)
SVM, Logistic Regression, K-Means
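A hedged sketch of the classification step using scikit-learn on the 1D spectrum features (as produced by extract_features above). The hyperparameters and the placeholder data are illustrative assumptions, not values from the paper.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Placeholder data: in practice, X stacks 1D power-spectrum features from
# real and fake images, and y holds labels (0 = real, 1 = fake).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 128))
y = rng.integers(0, 2, size=40)

# Supervised: per the paper, a handful of labeled samples can suffice.
svm = SVC(kernel="linear").fit(X, y)
logreg = LogisticRegression(max_iter=1000).fit(X, y)

# Unsupervised: cluster the spectra into two groups (real vs. fake).
kmeans = KMeans(n_clusters=2, n_init=10).fit(X)
```

The simplicity of these models is the point: the discriminative signal lives in the frequency features, so even a linear decision boundary or a two-cluster K-Means separates real from generated images.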
Author countries
Germany