DeepFakes Evolution: Analysis of Facial Regions and Fake Detection Performance

Authors: Ruben Tolosana, Sergio Romero-Tapiador, Julian Fierrez, Ruben Vera-Rodriguez

Published: 2020-04-16 08:49:32+00:00

Journal Ref: Proc. International Conference on Pattern Recognition Workshops 2021

AI Summary

This study provides an exhaustive analysis of both 1st and 2nd DeepFake generations, evaluating fake detection performance based on input facial regions. It compares traditional entire-face input with a novel approach using specific facial regions (eyes, nose, mouth, rest) and highlights the poor detection rates of state-of-the-art methods on newer, more realistic DeepFakes. The research reveals the necessity for developing more sophisticated fake detectors given the evolving realism of DeepFakes.

Abstract

Media forensics has attracted a lot of attention in recent years, in part due to the increasing concerns around DeepFakes. From the initial DeepFake databases of the 1st generation, such as UADFV and FaceForensics++, to the latest databases of the 2nd generation, such as Celeb-DF and DFDC, many visual improvements have been made, making fake videos almost indistinguishable to the human eye. This study provides an exhaustive analysis of both 1st and 2nd DeepFake generations in terms of facial regions and fake detection performance. Two different methods are considered in our experimental framework: i) the traditional one followed in the literature, based on selecting the entire face as input to the fake detection system, and ii) a novel approach based on selecting specific facial regions as input to the fake detection system. Among the findings of our experiments, we highlight the poor fake detection results achieved even by the strongest state-of-the-art fake detectors on the latest DeepFake databases of the 2nd generation, with Equal Error Rate results ranging from 15% to 30%. These results underscore the need for further research to develop more sophisticated fake detectors.


Key findings
State-of-the-art detectors perform markedly worse on 2nd generation DeepFake databases (Celeb-DF, DFDC; EER 15-30%) than on 1st generation databases (EER 1-3%). The eyes are generally the most discriminative facial region for fake detection. The nose, the mouth, and the rest of the face look more realistic in 2nd generation DeepFakes and therefore yield higher detection errors.
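
The Equal Error Rate (EER) quoted above is the operating point at which the rate of real videos flagged as fake equals the rate of fakes that go undetected. The following is a minimal sketch of how such a figure could be approximated from detector scores, not the authors' evaluation code. The function name `equal_error_rate`, the convention that higher scores mean "more likely fake", and the label encoding (1 = fake, 0 = real) are all assumptions made for illustration.

```python
import numpy as np

def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping a decision threshold.

    Assumptions (illustrative, not from the paper):
    higher score = more likely fake; labels use 1 = fake, 0 = real.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    fake_scores = scores[labels == 1]
    real_scores = scores[labels == 0]

    # Candidate thresholds: every observed score, plus +inf so that
    # "reject everything" is also considered.
    thresholds = np.append(np.unique(scores), np.inf)

    # False positive rate: real samples scored at or above the threshold.
    fpr = np.array([(real_scores >= t).mean() for t in thresholds])
    # False negative rate: fake samples scored below the threshold.
    fnr = np.array([(fake_scores < t).mean() for t in thresholds])

    # The EER lies where the two error rates cross; with a finite set of
    # scores we take the closest point and average the two rates.
    best = np.argmin(np.abs(fpr - fnr))
    return (fpr[best] + fnr[best]) / 2.0
```

With a finite test set the two error curves rarely cross exactly, which is why the sketch averages them at the nearest threshold; interpolating between neighbouring thresholds is a common alternative.
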
Approach
The study evaluates DeepFake detection by comparing two input methods: using the entire face versus selecting specific facial regions (eyes, nose, mouth, rest) for analysis. Facial regions are segmented using the OpenFace2 toolbox, and state-of-the-art fake detectors (Xception and Capsule Network) are applied to analyze the discriminative power of each region across different DeepFake generations.
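
The region-based input described above can be illustrated with a small sketch that splits a face crop into eyes, nose, mouth, and rest from 2D facial landmarks. This is not the authors' pipeline. It assumes landmarks in the widely used 68-point layout (eyes 36-47, nose 27-35, mouth 48-67), given as an array of (x, y) pixel coordinates, and it assumes that "rest" means the face with the other three regions masked out. The names `REGION_LANDMARKS`, `region_box`, and `split_face_regions`, and the `margin` parameter, are hypothetical.

```python
import numpy as np

# Landmark index ranges in the common 68-point layout (assumption).
REGION_LANDMARKS = {
    "eyes": range(36, 48),
    "nose": range(27, 36),
    "mouth": range(48, 68),
}

def region_box(landmarks, indices, margin, shape):
    """Bounding box (x0, y0, x1, y1) around the chosen landmarks,
    padded by `margin` pixels and clipped to the image bounds."""
    pts = landmarks[list(indices)]
    x0, y0 = np.floor(pts.min(axis=0)).astype(int) - margin
    x1, y1 = np.ceil(pts.max(axis=0)).astype(int) + margin
    height, width = shape[:2]
    return (max(int(x0), 0), max(int(y0), 0),
            min(int(x1), width), min(int(y1), height))

def split_face_regions(face, landmarks, margin=5):
    """Return a dict of crops for eyes, nose, and mouth, plus a 'rest'
    image in which those three boxes are set to zero."""
    landmarks = np.asarray(landmarks, dtype=float)
    regions = {}
    rest = face.copy()
    for name, indices in REGION_LANDMARKS.items():
        x0, y0, x1, y1 = region_box(landmarks, indices, margin, face.shape)
        regions[name] = face[y0:y1, x0:x1].copy()
        rest[y0:y1, x0:x1] = 0  # mask this region out of the 'rest' image
    regions["rest"] = rest
    return regions
```

Each returned crop could then be resized and passed to a separate detector, so that per-region error rates can be compared against the whole-face baseline.
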
Datasets
UADFV, FaceForensics++, Celeb-DF, DFDC Preview
Model(s)
Xception, Capsule Network
Author countries
Spain