Unmasking Deepfake Faces from Videos Using An Explainable Cost-Sensitive Deep Learning Approach

Authors: Faysal Mahmud, Yusha Abdullah, Minhajul Islam, Tahsin Aziz

Published: 2023-12-17 14:57:10+00:00

AI Summary

This research proposes a cost-sensitive deep learning approach for detecting deepfake faces in videos, leveraging key frame extraction for efficiency and four pre-trained CNN models (XceptionNet, InceptionResNetV2, EfficientNetV2S, and EfficientNetV2M) for improved accuracy. The method addresses dataset imbalance and incorporates Explainable AI techniques for enhanced interpretability.

Abstract

Deepfake technology is now widely used, raising serious concerns about the authenticity of digital media and making trustworthy deepfake face recognition techniques more urgent than ever. This study employs a resource-efficient and transparent cost-sensitive deep learning method to detect deepfake faces in videos. To create a reliable deepfake detection system, four pre-trained Convolutional Neural Network (CNN) models were used: XceptionNet, InceptionResNetV2, EfficientNetV2S, and EfficientNetV2M. FaceForensics++ and CelebDF-V2 were used as benchmark datasets to assess the performance of our method. To process video data efficiently, key frame extraction was used as a feature extraction technique. Our main contribution is to show the models' adaptability and effectiveness in correctly identifying deepfake faces in videos. Furthermore, a cost-sensitive neural network method was applied to address the dataset imbalance that arises frequently in deepfake detection. With the proposed methodology, the XceptionNet model achieved the highest accuracy, 98%, on the CelebDF-V2 dataset, whereas the InceptionResNetV2 model achieved an accuracy of 94% on the FaceForensics++ dataset. Source Code: https://github.com/Faysal-MD/Unmasking-Deepfake-Faces-from-Videos-An-Explainable-Cost-Sensitive-Deep-Learning-Approach-IEEE2023


Key findings
The XceptionNet model achieved 98% accuracy on the CelebDF-V2 dataset, while InceptionResNetV2 reached 94% accuracy on the FaceForensics++ dataset. Explainable AI techniques provided insight into the models' decision-making processes, improving interpretability. The authors report that these results surpass several state-of-the-art methods.
Approach
The approach uses key frame extraction to reduce computational cost and improve efficiency. Four pre-trained CNN models are employed, and a cost-sensitive learning method is implemented to handle class imbalance in the datasets. Explainable AI techniques are used to interpret the models' predictions.
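
The two ideas in the approach above, key frame selection and cost-sensitive weighting, can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not say how key frames are chosen, so the frame-difference rule, the function names (`select_key_frames`, `balanced_class_weights`, `weighted_bce`), the threshold, and the representation of frames as flat lists of pixel intensities are all assumptions made for illustration. The weighting rule used here is the common "balanced" heuristic, n_samples / (n_classes * class_count).

```python
import math
from collections import Counter


def select_key_frames(frames, threshold):
    """Return indices of frames that differ enough from the last kept frame.

    Assumption: each frame is a flat list of grayscale intensities of equal
    length. A frame is kept when its mean absolute difference from the most
    recently kept frame exceeds `threshold` (a hypothetical selection rule).
    """
    if not frames:
        return []
    kept = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        ref = frames[kept[-1]]
        diff = sum(abs(a - b) for a, b in zip(frames[i], ref)) / len(ref)
        if diff > threshold:
            kept.append(i)
    return kept


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency (the 'balanced' heuristic)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


def weighted_bce(y_true, p_fake, weights, eps=1e-7):
    """Binary cross-entropy where each sample's loss is scaled by its class weight.

    y_true: 0 (real) or 1 (fake); p_fake: predicted probability of class 1.
    """
    total = 0.0
    for y, p in zip(y_true, p_fake):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        loss = -math.log(p) if y == 1 else -math.log(1 - p)
        total += weights[y] * loss
    return total / len(y_true)


# Toy data: five tiny "frames" and an imbalanced label set (3 fake, 1 real).
frames = [[0, 0], [0, 1], [10, 10], [10, 11], [0, 0]]
key_idx = select_key_frames(frames, threshold=2.0)
weights = balanced_class_weights([1, 1, 1, 0])
```

With weights like these, an error on the minority class contributes more to the loss than an error on the majority class, which is the general idea behind cost-sensitive training. In practice such a dictionary can be handed to a framework rather than applied by hand; Keras, for example, accepts one through the `class_weight` argument of `Model.fit`.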
Datasets
FaceForensics++, CelebDF-V2
Model(s)
XceptionNet, InceptionResNetV2, EfficientNetV2S, EfficientNetV2M
Author countries
Bangladesh