An Experimental Evaluation on Deepfake Detection using Deep Face Recognition

Authors: Sreeraj Ramachandran, Aakash Varma Nadimpalli, Ajita Rattani

Published: 2021-10-04 18:02:56+00:00

AI Summary

This paper evaluates deep face recognition as a deepfake detector, comparing it against two-class CNNs and ocular-based methods. On Celeb-DF it reaches a maximum AUC of 0.98 and an EER of 7.1%, and on FaceForensics++ an AUC of 0.99 and an EER of 2.04%. The approach is most effective on identity-swapping deepfakes.

Abstract

Significant advances in deep learning have obtained hallmark accuracy rates for various computer vision applications. However, advances in deep generative models have also led to the generation of very realistic fake content, also known as deepfakes, posing a threat to privacy, democracy, and national security. Most current deepfake detection methods treat the task as a binary classification problem, distinguishing authentic images or videos from fake ones using two-class convolutional neural networks (CNNs). These methods are based on detecting visual artifacts and temporal or color inconsistencies produced by deep generative models. However, they require a large amount of real and fake data for model training, and their performance drops significantly in cross-dataset evaluation with samples generated using advanced deepfake generation techniques. In this paper, we thoroughly evaluate the efficacy of deep face recognition in identifying deepfakes, using different loss functions and deepfake generation techniques. Experimental investigations on the challenging Celeb-DF and FaceForensics++ deepfake datasets suggest the efficacy of deep face recognition in identifying deepfakes over two-class CNNs and the ocular modality. Reported results suggest a maximum Area Under Curve (AUC) of 0.98 and an Equal Error Rate (EER) of 7.1% in detecting deepfakes using face recognition on the Celeb-DF dataset. This EER is lower by 16.6% compared to the EER obtained for the two-class CNN and the ocular modality on the Celeb-DF dataset. Further, on the FaceForensics++ dataset, an AUC of 0.99 and EER of 2.04% were obtained. The use of biometric facial recognition technology has the advantage of bypassing the need for a large amount of fake data for model training and obtaining better generalizability to evolving deepfake creation techniques.
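
To make the two reported metrics concrete, the following is a minimal sketch of how AUC and EER can be computed from match scores. It assumes each video yields one similarity score against an authentic template, with higher scores meaning "more likely genuine". The function names, the toy scores, and the finite-sample EER approximation (the threshold where the two error rates are closest) are illustrative assumptions, not the paper's evaluation code.

```python
from typing import Sequence


def auc(genuine_scores: Sequence[float], fake_scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen genuine score exceeds a randomly chosen fake score (ties count half)."""
    wins = 0.0
    for g in genuine_scores:
        for f in fake_scores:
            if g > f:
                wins += 1.0
            elif g == f:
                wins += 0.5
    return wins / (len(genuine_scores) * len(fake_scores))


def equal_error_rate(genuine_scores: Sequence[float], fake_scores: Sequence[float]) -> float:
    """Approximate EER: sweep every observed score as a threshold and report the
    mean of the two error rates at the threshold where they are closest.

    A sample is accepted as genuine when its score is >= threshold.
    FAR = fraction of fakes accepted; FRR = fraction of genuine samples rejected.
    """
    thresholds = sorted(set(genuine_scores) | set(fake_scores))
    best_gap, best_eer = None, None
    for t in thresholds:
        far = sum(s >= t for s in fake_scores) / len(fake_scores)
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Toy scores (made up for illustration only).
genuine = [0.9, 0.8, 0.7, 0.4]
fake = [0.1, 0.2, 0.3, 0.5]
print(auc(genuine, fake), equal_error_rate(genuine, fake))
```

A lower EER and a higher AUC both indicate better separation between genuine and deepfake scores; production evaluations typically interpolate the ROC curve rather than picking the nearest observed threshold.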


Key findings
Deep face recognition significantly outperforms two-class CNNs and ocular-based methods in deepfake detection, especially for identity-swapping techniques. The best performance was achieved using CosFace and Combined margin loss functions. Expression-swapping deepfakes were harder to detect with this approach.
Approach
The authors use ResNet-50 deep face recognition models pre-trained on large face recognition datasets. Deepfake detection is performed by extracting feature vectors from the frames of a video and comparing them, using cosine similarity, with the authentic templates of the subjects; a low similarity suggests the face does not belong to the subject and the video is a deepfake.
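The matching step can be sketched as follows. This is a minimal illustration under stated assumptions: embeddings are plain lists of floats standing in for the ResNet-50 feature vectors, per-frame scores are averaged into one video score, and the 0.5 threshold is arbitrary. The summary does not specify the aggregation rule or the threshold, so the names `video_match_score` and `is_deepfake` and both of those choices are hypothetical.
```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def video_match_score(frame_embeddings: Sequence[Sequence[float]],
                      template: Sequence[float]) -> float:
    """Mean cosine similarity between each frame embedding and the subject's
    authentic template (mean aggregation is an assumption)."""
    scores = [cosine_similarity(f, template) for f in frame_embeddings]
    return sum(scores) / len(scores)


def is_deepfake(frame_embeddings: Sequence[Sequence[float]],
                template: Sequence[float],
                threshold: float = 0.5) -> bool:
    """Flag the video when its match score falls below the threshold
    (the default threshold is an arbitrary placeholder)."""
    return video_match_score(frame_embeddings, template) < threshold


# Toy 2-D embeddings; real face embeddings have hundreds of dimensions.
template = [1.0, 0.0]
genuine_frames = [[0.9, 0.1], [1.0, 0.0]]
swapped_frames = [[0.0, 1.0], [-1.0, 0.2]]
print(is_deepfake(genuine_frames, template), is_deepfake(swapped_frames, template))
```
Because the decision depends only on similarity to authentic templates, no fake samples are needed to train the detector, which is the advantage the abstract highlights.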
Datasets
Celeb-DF and FaceForensics++
Model(s)
ResNet-50 (pre-trained on MS1M-ArcFace and WebFace12M datasets)
Author countries
USA