What's wrong with this video? Comparing Explainers for Deepfake Detection

Authors: Samuele Pino, Mark James Carman, Paolo Bestagini

Published: 2021-05-12 18:44:39+00:00

AI Summary

This research compares different explanation techniques for deepfake detection models. The authors adapt and extend SHAP, GradCAM, and self-attention models to explain the predictions of EfficientNet-based deepfake detectors, proposing new metrics to evaluate explanations and conducting a user survey.

Abstract

Deepfakes are computer-manipulated videos where the face of an individual has been replaced with that of another. Software for creating such forgeries is easy to use and ever more popular, causing serious threats to personal reputation and public security. The quality of classifiers for detecting deepfakes has improved with the release of ever larger datasets, but the understanding of why a particular video has been labelled as fake has not kept pace. In this work we develop, extend and compare white-box, black-box and model-specific techniques for explaining the labelling of real and fake videos. In particular, we adapt SHAP, GradCAM and self-attention models to the task of explaining the predictions of state-of-the-art detectors based on EfficientNet, trained on the Deepfake Detection Challenge (DFDC) dataset. We compare the obtained explanations, proposing metrics to quantify their visual features and desirable characteristics, and also perform a user survey collecting users' opinions regarding the usefulness of the explainers.


Key findings
The study compares several explanation methods for deepfake detection and proposes new metrics for evaluating them. Results show that some techniques yield more stable and coherent explanations than others, and a user survey provides a subjective assessment of explanation quality.
Approach
The study adapts and extends existing explanation techniques (SHAP, GradCAM, and self-attention) to analyse EfficientNet-based deepfake detection models. A novel 3D segmentation method allows SHAP to handle video data, and GradCAM is modified for binary classification. The resulting explanations are evaluated using intrinsic metrics and a user survey.
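To make the GradCAM step concrete, the following is a minimal sketch of the standard Grad-CAM computation the paper builds on: channel weights are the spatially averaged gradients of the class score with respect to the last convolutional feature maps, and the heatmap is the ReLU of their weighted sum. The function name, array shapes, and synthetic inputs below are illustrative assumptions, not the authors' implementation; for a binary detector with a single "fake" logit, the gradients would be taken with respect to that logit.

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM heatmap from conv activations and their gradients.

    feature_maps: (K, H, W) activations of the last conv layer
    gradients:    (K, H, W) d(score)/d(activations), e.g. score = fake logit
    returns:      (H, W) heatmap normalised to [0, 1]
    """
    # Channel weights: global-average-pooled gradients.
    weights = gradients.mean(axis=(1, 2))               # shape (K,)
    # Weighted sum over channels, then ReLU to keep positive evidence only.
    cam = np.tensordot(weights, feature_maps, axes=1)   # shape (H, W)
    cam = np.maximum(cam, 0.0)
    if cam.max() > 0:
        cam = cam / cam.max()                           # normalise to [0, 1]
    return cam

# Illustrative call on random data standing in for real activations/gradients.
rng = np.random.default_rng(0)
heatmap = grad_cam(rng.random((4, 7, 7)), rng.standard_normal((4, 7, 7)))
```

In practice the (H, W) heatmap would be upsampled to the input frame size and overlaid on the face region to visualise which areas drove the "fake" prediction.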
Datasets
Deepfake Detection Challenge (DFDC) dataset
Model(s)
EfficientNet (B4 and B7), EfficientNet with self-attention (LTPA), Bonettini's self-attention model
Author countries
Italy