DeepFake MNIST+: A DeepFake Facial Animation Dataset

Authors: Jiajun Huang, Xueyu Wang, Bo Du, Pei Du, Chang Xu

Published: 2021-08-18 02:37:17+00:00

AI Summary

This paper introduces DeepFake MNIST+, a new dataset of 10,000 facial animation videos showcasing ten different actions, designed to challenge existing liveness detectors. A baseline detection method is also presented and analyzed, revealing challenges in detecting animations under varying motion and compression.

Abstract

DeepFakes, i.e., facial manipulation techniques, are an emerging threat to digital society. Various DeepFake detection methods and datasets have been proposed for detecting such data, especially for face swapping. However, recent research pays less attention to facial animation, which is also important on the DeepFake attack side. Facial animation drives a face image with actions taken from a driving video, which raises concerns about the security of recent payment systems that rely on liveness detection to authenticate real users by recognising a sequence of user facial actions. Our experiments show that existing datasets are not sufficient for developing reliable detection methods, and that current liveness detectors cannot defend against such videos as an attack. In response, we propose a new human face animation dataset, called DeepFake MNIST+, generated by a SOTA image animation generator. It includes 10,000 facial animation videos covering ten different actions, which can spoof recent liveness detectors. A baseline detection method and a comprehensive analysis of that method are also included in this paper. In addition, we analyze the proposed dataset's properties and reveal the difficulty and importance of detecting animation videos under different types of motion and compression quality.


Key findings
ResNet models showed the best performance in deepfake detection, achieving over 96% accuracy on raw videos. Video compression significantly impacted detection accuracy, with heavier compression leading to lower performance. Actions with large movements were easier to detect than subtle actions.
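The summary does not state how the compressed variants were produced. As a rough illustration only, assuming H.264/CRF-style re-encoding of the kind used by comparable benchmarks such as FF++ (the tool, flags, and CRF values 23/40 below are assumptions, not the paper's specification), videos could be re-compressed at different quality levels with ffmpeg driven from Python:

# Hypothetical helper: re-encode a video at a given H.264 CRF level via ffmpeg.
# CRF 23 (light) and 40 (heavy) mirror common DeepFake benchmark practice and
# are illustrative assumptions, not the dataset's documented settings.
import subprocess
from pathlib import Path

def compress_video(src: Path, dst: Path, crf: int) -> None:
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(src),
         "-c:v", "libx264", "-crf", str(crf),
         "-c:a", "copy", str(dst)],
        check=True,
    )

for crf in (23, 40):  # light and heavy compression variants
    compress_video(Path("video_raw.mp4"), Path(f"video_c{crf}.mp4"), crf)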
Approach
The authors generated a new dataset, DeepFake MNIST+, using a state-of-the-art image animation generator. They then used this dataset to train and evaluate several deep learning models for deepfake detection, focusing on the impact of compression and different types of facial actions.
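As a minimal sketch of the kind of frame-level detector evaluated in such a setup (not the authors' exact code; it assumes a PyTorch/torchvision environment, an ImageFolder-style directory of real/fake frames at frames/train, and illustrative hyperparameters):

# Sketch: fine-tune an ImageNet-pretrained ResNet50 as a binary real/fake classifier.
# Directory layout frames/train/{real,fake}/*.jpg and all hyperparameters are assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
train_set = datasets.ImageFolder("frames/train", transform=transform)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True, num_workers=4)

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)  # two classes: real vs. fake
model = model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(5):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

A video-level decision can then be obtained by averaging per-frame scores across sampled frames, a common convention for frame-based detectors.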
Datasets
DeepFake MNIST+ (generated by authors), VoxCeleb1, ADFES, FF++, Celeb-DF, DFDC
Model(s)
MesoInception-4, XceptionNet, ResNet50, ResNet101, ResNet152
Author countries
Australia, China