Deepfake Detection: A Comparative Analysis

Authors: Sohail Ahmed Khan, Duc-Tien Dang-Nguyen

Published: 2023-08-07 10:57:20+00:00

AI Summary

This paper presents a comparative analysis of supervised and self-supervised deep learning models for deepfake detection. The study evaluates the performance and generalization capabilities of various architectures (CNNs and Transformers) across four benchmark datasets, providing insights into model selection and training strategies for improved deepfake detection.

Abstract

This paper presents a comprehensive comparative analysis of supervised and self-supervised models for deepfake detection. We evaluate eight supervised deep learning architectures and two transformer-based models pre-trained using self-supervised strategies (DINO, CLIP) on four benchmarks (FakeAVCeleb, CelebDF-V2, DFDC, and FaceForensics++). Our analysis includes intra-dataset and inter-dataset evaluations, examining the best performing models, generalisation capabilities, and impact of augmentations. We also investigate the trade-off between model size and performance. Our main goal is to provide insights into the effectiveness of different deep learning architectures (transformers, CNNs), training strategies (supervised, self-supervised), and deepfake detection benchmarks. These insights can help guide the development of more accurate and reliable deepfake detection systems, which are crucial in mitigating the harmful impact of deepfakes on individuals and society.


Key findings
Models with multi-scale feature processing generally performed best in intra-dataset evaluations. Transformer models showed superior generalization capabilities in inter-dataset evaluations. The DFDC dataset proved most challenging for training, while FaceForensics++ offered the best generalization, suggesting that more challenging datasets lead to better generalization.
Approach
The researchers trained and evaluated eight supervised and two self-supervised deep learning models (CNNs and Transformers) on four deepfake detection datasets. They conducted both intra-dataset and inter-dataset evaluations to assess performance and generalization capabilities, analyzing the impact of augmentations and model size.
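The intra-/inter-dataset protocol described above can be sketched as a train-by-test grid: one model is trained per benchmark, then scored on every benchmark, so diagonal entries give intra-dataset performance and off-diagonal entries give inter-dataset generalization. The dataset names below come from the paper; `train_fn`, `eval_fn`, and the toy scores are hypothetical placeholders, not the authors' pipeline.

```python
# Benchmarks used in the paper; a real pipeline would load each dataset.
DATASETS = ["FakeAVCeleb", "CelebDF-V2", "DFDC", "FaceForensics++"]

def cross_dataset_grid(train_fn, eval_fn, datasets):
    """Train one model per dataset and score it on every dataset.

    Diagonal (train == test) entries are intra-dataset results;
    off-diagonal entries measure inter-dataset generalization.
    """
    grid = {}
    for train_ds in datasets:
        model = train_fn(train_ds)
        for test_ds in datasets:
            kind = "intra" if train_ds == test_ds else "inter"
            grid[(train_ds, test_ds)] = (kind, eval_fn(model, test_ds))
    return grid

# Toy stand-ins so the sketch runs end to end; a real detector (CNN or
# Transformer) would be fitted here and report accuracy or AUC.
def toy_train(dataset_name):
    return {"trained_on": dataset_name}

def toy_eval(model, dataset_name):
    # Pretend intra-dataset scores are higher than inter-dataset ones.
    return 0.95 if model["trained_on"] == dataset_name else 0.70

grid = cross_dataset_grid(toy_train, toy_eval, DATASETS)
```

With four benchmarks the grid has 16 entries: 4 intra-dataset scores and 12 inter-dataset scores, which is the structure the paper's generalization comparison is built on.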
Datasets
FakeAVCeleb, CelebDF-V2, DFDC, and FaceForensics++
Model(s)
Xception, Res2Net-101, EfficientNet-B7, ViT, Swin Transformer, MViT, ResNet-3D, TimeSformer, DINO (self-supervised ViT), CLIP (self-supervised ViT)
Author countries
Norway