DeepFake Detection with Inconsistent Head Poses: Reproducibility and Analysis

Authors: Kevin Lutz, Robert Bassett

Published: 2021-08-28 22:56:09+00:00

AI Summary

This paper analyzes a DeepFake detection method based on head pose inconsistencies. The authors conduct a reproducibility study and find that the method's effectiveness is significantly overstated in existing literature due to flawed assumptions and algorithmic bias. Their findings correct the current perception of state-of-the-art performance for DeepFake detection.

Abstract

Applications of deep learning to synthetic media generation allow the creation of convincing forgeries, called DeepFakes, with limited technical expertise. DeepFake detection is an increasingly active research area. In this paper, we analyze an existing DeepFake detection technique based on head pose estimation, which can be applied when fake images are generated with an autoencoder-based face swap. Existing literature suggests that this method is an effective DeepFake detector, and its motivating principles are attractively simple. With an eye towards using these principles to develop new DeepFake detectors, we conduct a reproducibility study of the existing method. We conclude that its merits are dramatically overstated, despite its celebrated status. By investigating this discrepancy we uncover a number of important and generalizable insights related to facial landmark detection, identity-agnostic head pose estimation, and algorithmic bias in DeepFake detectors. Our results correct the current literature's perception of state of the art performance for DeepFake detection.


Key findings
- The analyzed DeepFake detection method's high accuracy claims are not reproducible.
- The method suffers from algorithmic bias: it relies on the identity of subjects in the training data rather than genuine DeepFake characteristics.
- Inconsistent head pose estimation, due to optimization challenges, further compromises its effectiveness.

Approach
The paper analyzes an existing DeepFake detection technique that uses head pose estimation from facial landmarks to identify inconsistencies between the inner and outer regions of a face in manipulated images. The authors conduct a reproducibility study and identify several methodological flaws, including algorithmic bias and inaccurate pose estimation.
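The consistency check at the heart of the analyzed detector can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the `[yaw, pitch, roll]` pose encoding, the example angle values, and the scalar summaries are all assumptions. The intuition is that an autoencoder-based face swap replaces only the inner face, so a pose fit to inner-face landmarks can disagree with one fit to the whole face.

```python
import numpy as np

def pose_difference_feature(pose_inner, pose_outer):
    """Compare two head-pose estimates, one fit to inner-face landmarks
    and one to the whole face. Inputs are hypothetical [yaw, pitch, roll]
    vectors in degrees. Returns the raw difference vector and the cosine
    similarity of the two poses as a scalar consistency summary."""
    pose_inner = np.asarray(pose_inner, dtype=float)
    pose_outer = np.asarray(pose_outer, dtype=float)
    diff = pose_inner - pose_outer
    cos = float(pose_inner @ pose_outer /
                (np.linalg.norm(pose_inner) * np.linalg.norm(pose_outer)))
    return diff, cos

# A consistent pair (as expected for a real image) versus an
# inconsistent one (as expected for an autoencoder face swap).
real_diff, real_cos = pose_difference_feature([10.0, 5.0, 2.0], [11.0, 5.5, 2.2])
fake_diff, fake_cos = pose_difference_feature([10.0, 5.0, 2.0], [25.0, -3.0, 8.0])
```

In the detector the difference vector (or a summary of it) becomes the feature fed to a downstream classifier; the paper's point is that in practice this feature ends up encoding subject identity and noisy pose fits rather than manipulation artifacts.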
Datasets
University of Albany Deep Fake Video (UADFV), DARPA GAN challenge dataset, FaceForensics++ (FF++), Deep Fake Detection Dataset (DFDD), Deep Fake Detection Challenge (DFDC), Celeb-DF
Model(s)
Support Vector Machine (SVM) with radial basis function kernel, an ensemble of gradient-boosted regression trees for facial landmark detection, and a pre-trained landmark estimation model.
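As a hedged sketch of the classifier stage named above, the following trains an RBF-kernel SVM on synthetic pose-difference features. The feature distributions (real images clustering near zero pose difference, fakes spread out) and the scikit-learn usage are illustrative assumptions, not the paper's actual training setup or data.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical per-image features: [yaw, pitch, roll] differences between
# inner-face and whole-face pose estimates. Real images are assumed to
# cluster near zero; fakes are assumed to show larger discrepancies.
real = rng.normal(0.0, 1.0, size=(200, 3))
fake = rng.normal(0.0, 6.0, size=(200, 3))
X = np.vstack([real, fake])
y = np.array([0] * 200 + [1] * 200)  # 0 = real, 1 = fake

# RBF-kernel SVM, matching the classifier type named in the summary.
clf = SVC(kernel="rbf").fit(X, y)
```

On cleanly separated synthetic features like these the classifier looks strong; the paper's finding is that on real data the analogous classifier keys on subject identity in the training set, so this apparent performance does not transfer.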
Author countries
USA