Using Deep Learning to Detecting Deepfakes

Authors: Jacob Mallet, Rushit Dave, Naeem Seliya, Mounika Vanamala

Published: 2022-07-27 17:05:16+00:00

AI Summary

This research paper provides a survey of deep learning models used for deepfake detection, focusing on various approaches, benefits, limitations, and future research directions. The study reviews models that leverage temporal features, biological signals, or a combination of both for identifying inconsistencies in deepfake videos and images.

Abstract

In recent years, social media has grown to become a major source of information for many online users. This has given rise to the spread of misinformation through deepfakes. Deepfakes are videos or images that replace one person's face with another computer-generated face, often that of a more recognizable person in society. With recent advances in technology, a person with little technological experience can generate these videos. This enables them to mimic a powerful figure in society, such as a president or celebrity, creating the potential danger of spreading misinformation and other nefarious uses of deepfakes. To combat this online threat, researchers have developed models that are designed to detect deepfakes. This study looks at various deepfake detection models that use deep learning algorithms to combat this looming threat. This survey focuses on providing a comprehensive overview of the current state of deepfake detection models and the unique approaches many researchers take to solving this problem. The benefits, limitations, and suggestions for future work are discussed throughout the paper.


Key findings
Models using temporal features generally achieved higher accuracy than those relying primarily on biological features. Although many models reported high accuracy, transferability across different datasets remains a significant challenge. The use of CNNs and other deep learning algorithms shows promise in deepfake detection.
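
The transferability concern can be made concrete by scoring one detector on several datasets and comparing the results. The following is a minimal sketch, not a procedure taken from the paper: the function names, the toy threshold detector, and the sample values are all hypothetical and stand in for a trained model and real datasets.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A sample is (feature vector, label), with label 1 = fake and 0 = real.
Sample = Tuple[Sequence[float], int]
Detector = Callable[[Sequence[float]], int]


def accuracy(predict: Detector, samples: List[Sample]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if not samples:
        raise ValueError("samples must not be empty")
    correct = sum(1 for features, label in samples if predict(features) == label)
    return correct / len(samples)


def cross_dataset_report(predict: Detector,
                         datasets: Dict[str, List[Sample]]) -> Dict[str, float]:
    """Evaluate the same detector on every dataset, keyed by dataset name."""
    return {name: accuracy(predict, samples) for name, samples in datasets.items()}


def threshold_detector(features: Sequence[float]) -> int:
    """Hypothetical stand-in for a trained model: flag as fake above 0.5."""
    return 1 if features[0] > 0.5 else 0


# Made-up toy data: the detector fits the first set but not the second.
toy_datasets = {
    "in_domain": [([0.9], 1), ([0.1], 0), ([0.8], 1), ([0.2], 0)],
    "out_of_domain": [([0.4], 1), ([0.1], 0), ([0.6], 0), ([0.7], 1)],
}

report = cross_dataset_report(threshold_detector, toy_datasets)
print(report)
```

A large gap between the in-domain and out-of-domain scores is the kind of signal that would indicate poor transferability.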
Approach
The paper surveys existing deepfake detection models, categorizing them by approach: analyzing inconsistencies in facial feature movement across frames (temporal), detecting biological signs of life (biological), or using multimodal information (audio and other features). The paper does not propose a novel model itself.
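
To illustrate what an inconsistency in facial feature movement across frames might look like numerically, here is a hand-crafted sketch. It is only an assumption-laden stand-in: the surveyed models learn temporal features with networks such as CNNs and LSTMs rather than computing a fixed score, and the landmark tracks, function names, and the jitter measure itself are hypothetical.

```python
from math import hypot
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) position of one facial landmark in a frame


def frame_displacements(track: Sequence[Point]) -> List[float]:
    """Distance the landmark moves between each pair of consecutive frames."""
    return [hypot(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(track, track[1:])]


def jitter_score(track: Sequence[Point]) -> float:
    """Mean absolute change in per-frame displacement.

    Steady motion gives a score near zero; erratic frame-to-frame jumps
    give a larger score. This is an illustrative measure, not one defined
    in the surveyed work.
    """
    steps = frame_displacements(track)
    if len(steps) < 2:
        raise ValueError("need at least three frames")
    changes = [abs(b - a) for a, b in zip(steps, steps[1:])]
    return sum(changes) / len(changes)


# Made-up landmark tracks over four frames.
smooth_track = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
jumpy_track = [(0.0, 0.0), (3.0, 0.0), (3.0, 0.0), (6.0, 0.0)]

print(jitter_score(smooth_track))
print(jitter_score(jumpy_track))
```

In a learned detector, a sequence model would consume per-frame features and decide for itself which temporal patterns are suspicious; the fixed score here only shows the kind of signal involved.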
Datasets
Celeb-DF, Celeb-DF v2, FaceForensics++, DeepfakeTIMIT
Model(s)
Convolutional Neural Networks (CNNs), Long Short-Term Memory networks (LSTMs), Multitask Cascaded Convolutional Neural Networks, VGG16, VGG19, ResNet18, Convolutional Attention Network (CAN), 3D convolutional networks, Triplet networks
Author countries
USA