Detecting Deepfake Videos: An Analysis of Three Techniques

Authors: Armaan Pishori, Brittany Rollins, Nicolas van Houten, Nisha Chatwani, Omar Uraimov

Published: 2020-07-15 20:36:23+00:00

Comment: 11 pages, 8 figures, 2 tables

AI Summary

This paper analyzes three techniques for deepfake video detection: convolutional LSTM (CNN+RNN), eye blink detection, and grayscale histograms, developed during participation in the Deepfake Detection Challenge. The research found that the grayscale histogram technique demonstrated the highest accuracy among the methods explored, highlighting the importance of preprocessing for identifying deepfake artifacts.

Abstract

Recent advances in deepfake-generating algorithms, which produce manipulated media, have dangerous implications for privacy, security, and mass communication. Efforts to combat this issue have arisen in the form of competitions and funding for research to detect deepfakes. This paper presents three techniques and algorithms (convolutional LSTM, eye blink detection, and grayscale histograms) pursued while participating in the Deepfake Detection Challenge. We assessed the current knowledge about deepfake videos, a more severe version of manipulated media, and the methods previously used to detect them, and found the grayscale histogram technique more relevant than the others. We discussed the implications of each method developed and provided further steps to improve on these findings.


Key findings
The grayscale histogram method achieved the highest accuracy at 85.71%, outperforming CNN+RNN (82.20%) and eye blink detection (81.67%). All three models scored between 80% and 90%, which suggests that each technique is broadly effective for deepfake video detection even though limited computational resources and a reduced dataset constrained training.
Approach
The authors explore three methods: a CNN+RNN model for spatiotemporal feature extraction, an eye blink detection system using OpenCV and a KNN classifier to identify unnatural blink rates, and a grayscale histogram approach leveraging an LSTM to analyze temporal changes in video spectral responses.
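As a rough illustration of the grayscale histogram idea, the sketch below converts each frame to grayscale and reduces it to a normalized intensity histogram, yielding one feature vector per frame that a sequence model such as an LSTM could consume. It is not the authors' pipeline: the function names, the 16-bin default, and the BT.601 luma weights are assumptions made for this example, and frames are modeled as plain nested lists of (r, g, b) tuples to keep the code dependency-free.

```python
def to_gray(pixel):
    """Convert an (r, g, b) pixel (0-255 each) to one gray level (assumed BT.601 weights)."""
    r, g, b = pixel
    return min(int(round(0.299 * r + 0.587 * g + 0.114 * b)), 255)


def gray_histogram(frame, bins=16):
    """Return the normalized grayscale histogram of one frame.

    frame: list of rows, each row a list of (r, g, b) tuples.
    bins:  number of equal-width intensity bins (hypothetical default).
    """
    counts = [0] * bins
    total = 0
    for row in frame:
        for pixel in row:
            counts[to_gray(pixel) * bins // 256] += 1
            total += 1
    if total == 0:
        raise ValueError("frame contains no pixels")
    return [c / total for c in counts]


def histogram_sequence(frames, bins=16):
    """Map a list of frames to a list of per-frame histograms (time-ordered)."""
    return [gray_histogram(frame, bins) for frame in frames]


# Toy two-frame "video": one all-black frame, one half black and half white.
black = [[(0, 0, 0), (0, 0, 0)]]
mixed = [[(0, 0, 0), (255, 255, 255)]]
features = histogram_sequence([black, mixed])
```

Each entry of `features` is a fixed-length vector, so the sequence can be fed to a temporal model regardless of the original frame resolution.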
Datasets
Kaggle Deepfake Detection Challenge dataset (500 GB of video data, although only ~50 GB was used for training due to computational limits).
Model(s)
Convolutional Neural Network (CNN) + Recurrent Neural Network (RNN), K-Nearest Neighbors (KNN) Classifier, Long Short-Term Memory (LSTM) with additional neural network layers.
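To make the KNN entry concrete, here is a minimal sketch of a k-nearest-neighbors vote over a single feature, blinks per minute, in the spirit of the eye blink method. The function name `knn_predict`, the choice of k = 3, and the training values are all hypothetical toy inputs for illustration; they are not taken from the paper, and the blink counting itself (done with OpenCV in the paper) is assumed to have happened upstream.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify a blink rate by majority vote among its k nearest neighbors.

    train: list of (blinks_per_minute, label) pairs.
    query: blink rate of the video to classify.
    """
    # Sort training samples by absolute distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda item: abs(item[0] - query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data, NOT measurements from the paper.
toy_train = [
    (17.0, "real"), (15.0, "real"), (19.0, "real"),
    (3.0, "fake"), (5.0, "fake"), (2.0, "fake"),
]

low_rate_label = knn_predict(toy_train, 4.0)
high_rate_label = knn_predict(toy_train, 16.0)
```

With these toy values, a video with a low blink rate lands among the samples labeled "fake" and one with a higher rate among those labeled "real"; a real system would need more features and far more training data.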
Author countries
United States