DeepFakes Evolution: Analysis of Facial Regions and Fake Detection Performance

Authors: Ruben Tolosana, Sergio Romero-Tapiador, Julian Fierrez, Ruben Vera-Rodriguez

Published: 2020-04-16 08:49:32+00:00

AI Summary

This research analyzes how DeepFakes have evolved from the 1st to the 2nd generation of databases and compares fake detection performance under two approaches: using the entire face as input and using specific facial regions as input. State-of-the-art detectors perform markedly worse on the newer 2nd generation databases, with Equal Error Rates of 15% to 30%, indicating a need for more sophisticated detection methods.

Abstract

Media forensics has attracted a lot of attention in recent years, in part due to the increasing concerns around DeepFakes. Since the initial DeepFake databases from the 1st generation such as UADFV and FaceForensics++ up to the latest databases of the 2nd generation such as Celeb-DF and DFDC, many visual improvements have been carried out, making fake videos almost indistinguishable to the human eye. This study provides an exhaustive analysis of both the 1st and 2nd DeepFake generations in terms of facial regions and fake detection performance. Two different methods are considered in our experimental framework: i) the traditional one followed in the literature and based on selecting the entire face as input to the fake detection system, and ii) a novel approach based on the selection of specific facial regions as input to the fake detection system. Among all the findings resulting from our experiments, we highlight the poor fake detection results achieved even by the strongest state-of-the-art fake detectors in the latest DeepFake databases of the 2nd generation, with Equal Error Rate results ranging from 15% to 30%. These results underline the need for further research to develop more sophisticated fake detectors.


Key findings
State-of-the-art DeepFake detectors perform markedly worse on 2nd generation DeepFake datasets than on 1st generation ones, with Equal Error Rates (EER) of 15-30% on the newer data. Among the facial regions, 'eyes' consistently gave the best detection results across datasets, while 'rest' gave the worst. These findings highlight the need for more robust detection methods.
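
For readers unfamiliar with the metric, the Equal Error Rate (EER) is the operating point at which the share of real videos wrongly flagged as fake equals the share of fake videos that are missed. The following is a minimal Python sketch of that definition, not the authors' evaluation code. The function name `equal_error_rate`, the score convention (a higher score means "more likely fake"), and the toy scores are all assumptions made for illustration.

```python
def equal_error_rate(real_scores, fake_scores):
    """Approximate the EER by sweeping a decision threshold.

    Assumed convention: a sample is classified as fake when its
    score is greater than or equal to the threshold.
    """
    if not real_scores or not fake_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one above all of them.
    thresholds = sorted(set(real_scores) | set(fake_scores)) + [float("inf")]

    best_gap = None
    eer = None
    for t in thresholds:
        # Real samples wrongly flagged as fake.
        false_positive_rate = sum(s >= t for s in real_scores) / len(real_scores)
        # Fake samples wrongly accepted as real.
        false_negative_rate = sum(s < t for s in fake_scores) / len(fake_scores)

        gap = abs(false_positive_rate - false_negative_rate)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            # With finite data the two rates rarely cross exactly,
            # so report their midpoint at the closest threshold.
            eer = (false_positive_rate + false_negative_rate) / 2
    return eer


# Toy scores, invented for this example only.
real = [0.1, 0.2, 0.3, 0.6]
fake = [0.4, 0.7, 0.8, 0.9]
print(equal_error_rate(real, fake))  # 0.25
```

Under this reading, an EER of 15-30% means that at the balanced threshold roughly one in seven to almost one in three videos of each class is misclassified. Production evaluations usually interpolate along the ROC curve rather than pick the nearest threshold, so values from this sketch can differ slightly from published ones.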
Approach
The researchers compared two methods for DeepFake detection: using the entire face as input and using specific facial regions (eyes, nose, mouth, rest) as input to state-of-the-art models (Xception and Capsule Network). They evaluated performance on 1st and 2nd generation DeepFake datasets to analyze the impact of DeepFake improvements on detection accuracy.
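
The region-based input described above can be pictured as a cropping step that runs before the detector. The sketch below is one possible way to do it under stated assumptions and is not taken from the paper: it assumes facial landmark coordinates are already available from some landmark detector, represents a grayscale image as a nested list, and treats 'rest' as the face with the other three regions blanked out. The names `bounding_box`, `crop`, and `split_regions`, the `margin` and `fill` parameters, and the landmark coordinates are hypothetical.

```python
def bounding_box(points, margin, width, height):
    """Axis-aligned box around (x, y) landmarks, padded and clipped to the image."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    left = max(min(xs) - margin, 0)
    top = max(min(ys) - margin, 0)
    right = min(max(xs) + margin + 1, width)    # exclusive bound
    bottom = min(max(ys) + margin + 1, height)  # exclusive bound
    return left, top, right, bottom


def crop(image, box):
    """Return the sub-image covered by box as a new nested list."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def split_regions(image, landmarks, margin=1, fill=0):
    """Crop each named region and build a 'rest' image with those regions masked.

    landmarks maps a region name (for example 'eyes') to a list of (x, y) points.
    The input image is left unmodified.
    """
    height, width = len(image), len(image[0])
    regions = {}
    rest = [list(row) for row in image]  # copy so the original stays intact
    for name, points in landmarks.items():
        box = bounding_box(points, margin, width, height)
        regions[name] = crop(image, box)
        left, top, right, bottom = box
        for y in range(top, bottom):
            for x in range(left, right):
                rest[y][x] = fill  # blank out this region in the 'rest' image
    regions["rest"] = rest
    return regions


# Toy 10x10 "face" whose pixel value encodes its position (row * 10 + column).
face = [[row * 10 + col for col in range(10)] for row in range(10)]
# Hypothetical landmark positions, invented for this example.
points = {"eyes": [(2, 2), (7, 2)], "nose": [(5, 5)], "mouth": [(3, 8), (6, 8)]}
parts = split_regions(face, points)
print(sorted(parts))
```

Each entry of `parts` would then be resized and fed to a detector trained for that region, which is what allows the per-region comparison. Rectangular boxes keep the sketch short; a real pipeline may use tighter, landmark-shaped masks.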
Datasets
UADFV, FaceForensics++, Celeb-DF, DFDC Preview
Model(s)
Xception, Capsule Network
Author countries
Spain