Deepfake Detection and the Impact of Limited Computing Capabilities

Authors: Paloma Cantero-Arjona, Alfonso Sánchez-Macián

Published: 2024-02-08 11:04:34+00:00

AI Summary

This research investigates deepfake detection using deep learning models under limited computing resources. The study analyzes the applicability of different deep learning techniques, including 3D CNNs and Vision Transformers, and explores methods to improve their efficiency in this constrained environment.

Abstract

The rapid development of technology and artificial intelligence is making deepfakes an increasingly sophisticated and difficult-to-identify forgery technique. To ensure the accuracy of information and to counter misinformation and mass manipulation, it is of paramount importance to develop artificial intelligence models that enable the generic detection of forged videos. This work addresses deepfake detection across several existing datasets in a scenario with limited computing resources. The goal is to analyze the applicability of different deep learning techniques under these restrictions and to explore possible approaches for enhancing their efficiency.


Key findings
3D CNN models proved impractical due to high computational demands. Vision Transformers showed some promise, achieving up to 67.56% precision after hyperparameter tuning; however, performance was limited by computational constraints, indicating a need for further optimization or alternative approaches for resource-limited settings.
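For reference, precision here is the fraction of videos flagged as fake that really are fake. The sketch below shows how such a score could be computed with scikit-learn; the example labels and the video-level aggregation are illustrative assumptions, not the paper's exact evaluation protocol.

```python
# Minimal sketch: precision for binary deepfake classification.
# Labels are hypothetical; video-level (rather than clip-level) scoring is an assumption.
from sklearn.metrics import precision_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]  # 1 = fake, 0 = real (ground truth)
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]  # hypothetical model predictions

# precision = TP / (TP + FP): of the videos flagged as fake, how many are actually fake
print(f"precision: {precision_score(y_true, y_pred):.4f}")
```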
Approach
The researchers used a (2+1)D CNN (ResNet18) and a Vision Transformer (ViViT with a factorized encoder) for deepfake detection. They preprocessed videos by extracting faces using RetinaFace and explored hyperparameter tuning (learning rate, number of frames, dropout) to optimize performance within resource constraints.
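As a rough illustration of the preprocessing step, the sketch below samples frames from a video and keeps a RetinaFace crop per frame, assuming the `retina-face` Python package and OpenCV; the frame count, crop size, and detector settings used in the paper are assumptions here.

```python
# Sketch of face-extraction preprocessing, assuming the `retina-face` package
# (pip install retina-face) and OpenCV for frame sampling.
import cv2
from retinaface import RetinaFace

def extract_face_frames(video_path, num_frames=16, size=(112, 112)):
    """Sample frames evenly from a video and keep the first detected face crop."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    step = max(total // num_frames, 1)
    crops = []
    for i in range(0, total, step):
        cap.set(cv2.CAP_PROP_POS_FRAMES, i)
        ok, frame = cap.read()
        if not ok:
            break
        # RetinaFace.extract_faces returns aligned face crops as numpy arrays
        faces = RetinaFace.extract_faces(frame, align=True)
        if faces:
            crops.append(cv2.resize(faces[0], size))
        if len(crops) == num_frames:
            break
    cap.release()
    return crops
```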
Datasets
UADFV, DeepfakeTIMIT (LQ and HQ), DFDC, and FaceForensics++
Model(s)
(2+1)D CNN (ResNet18) and Vision Transformer (ViViT)
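For the (2+1)D model family, torchvision ships an 18-layer (2+1)D ResNet (`r2plus1d_18`). The sketch below adapts it to a binary real/fake head; this is only a minimal stand-in, and the paper's exact architecture and the ViViT factorized-encoder model are not reproduced here.

```python
# Sketch: an 18-layer (2+1)D ResNet adapted for binary (real/fake) video classification.
import torch
import torch.nn as nn
from torchvision.models.video import r2plus1d_18

model = r2plus1d_18()  # optionally load Kinetics-pretrained weights instead
model.fc = nn.Linear(model.fc.in_features, 2)  # replace the classification head

# Video clips are fed as (batch, channels, frames, height, width)
clip = torch.randn(1, 3, 16, 112, 112)
logits = model(clip)
print(logits.shape)  # torch.Size([1, 2])
```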
Author countries
Spain