Improving the Efficiency and Robustness of Deepfakes Detection through Precise Geometric Features
Authors: Zekun Sun, Yujie Han, Zeyu Hua, Na Ruan, Weijia Jia
Published: 2021-04-09 16:57:55+00:00
Comment: IEEE/CVF Conference on Computer Vision and Pattern Recognition 2021 (CVPR 2021)
AI Summary
This paper introduces LRNet, an efficient and robust framework for detecting Deepfakes videos by leveraging temporal modeling on precise geometric features. It proposes a novel calibration module to enhance the precision of facial landmarks and employs a two-stream Recurrent Neural Network (RNN) to effectively exploit temporal features. The method aims to overcome the limitations of previous appearance-based techniques, such as high model complexity and sensitivity to noise, by focusing on inherent temporal artifacts in manipulated faces.
Abstract
Deepfakes is a family of malicious techniques that transplant a target face onto the original one in videos, causing serious problems such as copyright infringement, misinformation, and even public panic. Previous efforts at Deepfakes video detection have mainly focused on appearance features, which risk being bypassed by sophisticated manipulation and which also lead to high model complexity and sensitivity to noise. Moreover, how to mine and exploit the temporal features of manipulated videos remains an open question. We propose an efficient and robust framework named LRNet for detecting Deepfakes videos through temporal modeling on precise geometric features. A novel calibration module is devised to enhance the precision of the geometric features, making them more discriminative, and a two-stream Recurrent Neural Network (RNN) is constructed to fully exploit temporal features. Compared with previous methods, our method is more lightweight and easier to train. It is also robust when detecting highly compressed or noise-corrupted videos. Our model achieves 0.999 AUC on the FaceForensics++ dataset, and its performance declines gracefully (-0.042 AUC) on highly compressed videos.
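To make the idea of temporal modeling on geometric features more concrete, the following is a minimal sketch of how per-frame facial landmarks might be prepared as input for a two-stream recurrent model. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not say what each stream receives, so the split used here (one stream of landmark coordinates, one stream of frame-to-frame differences) is an assumption. The moving-average smoothing is only a simple stand-in for a calibration step and is not the paper's calibration module. The function names `smooth_landmarks` and `two_stream_inputs` are hypothetical.

```python
from typing import List, Tuple

# One frame: an (x, y) pair for each facial landmark.
Frame = List[Tuple[float, float]]


def smooth_landmarks(frames: List[Frame], window: int = 3) -> List[Frame]:
    """Average each landmark over a centered temporal window.

    Stand-in for a calibration step (an assumption for illustration);
    this is NOT the calibration module described in the paper.
    """
    half = window // 2
    smoothed: List[Frame] = []
    for t in range(len(frames)):
        # Clip the window at the start and end of the sequence.
        lo, hi = max(0, t - half), min(len(frames), t + half + 1)
        n = hi - lo
        frame: Frame = []
        for k in range(len(frames[t])):
            x = sum(frames[s][k][0] for s in range(lo, hi)) / n
            y = sum(frames[s][k][1] for s in range(lo, hi)) / n
            frame.append((x, y))
        smoothed.append(frame)
    return smoothed


def two_stream_inputs(
    frames: List[Frame],
) -> Tuple[List[List[float]], List[List[float]]]:
    """Build two feature sequences from a landmark sequence.

    Stream 1: flattened landmark coordinates for each frame.
    Stream 2: differences between consecutive frames (assumed split).
    """
    positions = [[c for point in frame for c in point] for frame in frames]
    differences = [
        [b - a for a, b in zip(positions[t], positions[t + 1])]
        for t in range(len(positions) - 1)
    ]
    return positions, differences
```

In a full pipeline, each of the two sequences would be fed to its own recurrent network and the two outputs fused into a real/fake score. That stage is omitted here because the abstract does not describe it.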