Improving the Efficiency and Robustness of Deepfakes Detection through Precise Geometric Features

Authors: Zekun Sun, Yujie Han, Zeyu Hua, Na Ruan, Weijia Jia

Published: 2021-04-09 16:57:55+00:00

AI Summary

This paper proposes LRNet, an efficient and robust deepfake detection framework that leverages temporal modeling of precise geometric features. LRNet uses a calibration module to enhance the precision of geometric features and a two-stream RNN to effectively exploit temporal features, resulting in a lightweight and easily trainable model.

Abstract

Deepfakes are a family of malicious techniques that transplant a target face onto the original one in videos, causing serious problems such as copyright infringement, confusion of information, or even public panic. Previous efforts at Deepfakes video detection mainly focused on appearance features, which risk being bypassed by sophisticated manipulation and also lead to high model complexity and sensitivity to noise. Moreover, how to mine and exploit the temporal features of manipulated videos remains an open question. We propose an efficient and robust framework named LRNet for detecting Deepfakes videos through temporal modeling on precise geometric features. A novel calibration module is devised to enhance the precision of the geometric features, making them more discriminative, and a two-stream Recurrent Neural Network (RNN) is constructed to fully exploit temporal features. Compared to previous methods, our proposed method is lighter-weight and easier to train. Moreover, our method has shown robustness in detecting highly compressed or noise-corrupted videos. Our model achieved 0.999 AUC on the FaceForensics++ dataset. Meanwhile, its performance declines gracefully (-0.042 AUC) when faced with highly compressed videos.


Key findings
LRNet achieved 0.999 AUC on the FaceForensics++ dataset and remained robust to video compression and noise, losing only 0.042 AUC on highly compressed videos. The calibration module significantly improved detection accuracy and robustness, while the two-stream RNN architecture effectively captured temporal inconsistencies.
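For readers less familiar with the metric, the sketch below shows one common way to compute AUC: as the fraction of (fake, real) video pairs in which the fake video receives the higher score, with ties counted as half. The function name `roc_auc` and the toy labels and scores are assumptions for illustration only; they are not the paper's evaluation code or data.

```python
from typing import List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """Pairwise AUC: probability that a positive outscores a negative."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count a win for each pair where the positive scores higher; ties count half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical scores: label 1 = fake, label 0 = real.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 1.0 would mean every fake video is scored above every real one, so a value of 0.999 indicates that almost all such pairs are ordered correctly.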
Approach
LRNet detects deepfakes by analyzing temporal patterns in precise geometric facial features. It employs a calibration module to refine landmark detection, followed by a two-stream RNN that models both facial shape movement and landmark positional differences over time.
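As a rough illustration of the two inputs described above, the sketch below builds a per-frame landmark stream and a frame-to-frame difference stream from a short landmark sequence. The function name `build_streams`, the flattened (x, y) layout, and the toy coordinates are assumptions made for illustration; they are not taken from the authors' implementation.

```python
from typing import List, Tuple

Landmarks = List[Tuple[float, float]]  # one (x, y) pair per facial landmark


def build_streams(
    frames: List[Landmarks],
) -> Tuple[List[List[float]], List[List[float]]]:
    """Return (positions, differences) for a sequence of per-frame landmarks.

    positions[t]   : flattened coordinates of frame t
    differences[t] : flattened coordinates of frame t+1 minus those of frame t
    """
    if not frames:
        return [], []
    n = len(frames[0])
    if any(len(f) != n for f in frames):
        raise ValueError("every frame must have the same number of landmarks")
    # Flatten [(x0, y0), (x1, y1), ...] into [x0, y0, x1, y1, ...] per frame.
    positions = [[c for point in f for c in point] for f in frames]
    # Difference between each pair of consecutive frames.
    differences = [
        [b - a for a, b in zip(prev, curr)]
        for prev, curr in zip(positions, positions[1:])
    ]
    return positions, differences


# Toy example: 3 frames with 2 landmarks each (hypothetical values).
toy_frames = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.5, 0.0), (1.0, 2.0)],
    [(1.0, 0.5), (1.0, 2.0)],
]
positions, differences = build_streams(toy_frames)
```

In this sketch the difference stream is one step shorter than the position stream, since each difference needs two consecutive frames. Each stream would then be fed to its own recurrent branch.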
Datasets
UADFV, FaceForensics++, Celeb-DF, DeeperForensics-1.0
Model(s)
Two-stream Recurrent Neural Network (RNN) with GRU units; Dlib and OpenFace used for landmark detection.
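To make the recurrent unit concrete, here is a minimal pure-Python sketch of a single GRU update and a loop that applies it over a sequence. It uses the standard GRU equations with the convention h_new = (1 - z) * h + z * candidate (some libraries swap the roles of z and 1 - z). The parameter names (`Wz`, `Uz`, `bz`, and so on), the helper `zero_params`, and the sizes used are illustrative assumptions, not the paper's configuration.

```python
import math
from typing import Callable, Dict, List

Vector = List[float]
Matrix = List[Vector]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix: Matrix, vec: Vector) -> Vector:
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def gate(W: Matrix, U: Matrix, b: Vector, x: Vector, h: Vector,
         activation: Callable[[float], float]) -> Vector:
    """Compute activation(W x + U h + b) element-wise."""
    return [activation(a + c + d)
            for a, c, d in zip(matvec(W, x), matvec(U, h), b)]


def gru_step(x: Vector, h: Vector, p: Dict[str, list]) -> Vector:
    """One GRU update: returns the new hidden state."""
    z = gate(p["Wz"], p["Uz"], p["bz"], x, h, sigmoid)   # update gate
    r = gate(p["Wr"], p["Ur"], p["br"], x, h, sigmoid)   # reset gate
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    candidate = gate(p["Wh"], p["Uh"], p["bh"], x, gated_h, math.tanh)
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, candidate)]


def run_gru(sequence: List[Vector], h0: Vector, p: Dict[str, list]) -> Vector:
    """Apply gru_step over every frame and return the final hidden state."""
    h = h0
    for x in sequence:
        h = gru_step(x, h, p)
    return h


def zero_params(input_size: int, hidden_size: int) -> Dict[str, list]:
    """Hypothetical helper: all-zero weights, handy for sanity checks."""
    params: Dict[str, list] = {}
    for name in ("z", "r", "h"):
        params["W" + name] = [[0.0] * input_size for _ in range(hidden_size)]
        params["U" + name] = [[0.0] * hidden_size for _ in range(hidden_size)]
        params["b" + name] = [0.0] * hidden_size
    return params


# With zero weights the update gate is 0.5 and the candidate is 0,
# so each step simply halves the hidden state.
demo_params = zero_params(input_size=4, hidden_size=2)
one_step = gru_step([1.0, 2.0, 3.0, 4.0], [1.0, -2.0], demo_params)
two_steps = run_gru([[1.0, 2.0, 3.0, 4.0]] * 2, [1.0, -2.0], demo_params)
```

In a two-stream setup, one such recurrent branch would process each input stream, and the outputs of the two branches would then be combined into a single real/fake score; how that combination is done is not specified here.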
Author countries
China