Model Attribution of Face-swap Deepfake Videos
Authors: Shan Jia, Xin Li, Siwei Lyu
Published: 2022-02-25 20:05:18+00:00
AI Summary
This paper addresses the problem of Deepfake model attribution, aiming to identify the specific generation model used to create a face-swap video. The authors introduce a new dataset, DFDM, containing Deepfakes from different autoencoder models and propose a spatial and temporal attention-based method (DMA-STA) that achieves over 70% accuracy in identifying the generation model.
Abstract
AI-created face-swap videos, commonly known as Deepfakes, have attracted wide attention as powerful impersonation attacks. Existing research on Deepfakes mostly focuses on binary detection to distinguish between real and fake videos. However, it is also important to determine the specific generation model for a fake video, which can help attribute it to the source for forensic investigation. In this paper, we fill this gap by studying the model attribution problem of Deepfake videos. We first introduce a new dataset with DeepFakes from Different Models (DFDM) based on several autoencoder models. Specifically, five generation models with variations in encoder, decoder, intermediate layer, input resolution, and compression ratio have been used to generate a total of 6,450 Deepfake videos based on the same input. Then we take Deepfake model attribution as a multiclass classification task and propose a spatial and temporal attention-based method to explore the differences among Deepfakes in the new dataset. Experimental evaluation shows that most existing Deepfake detection methods failed at Deepfake model attribution, while the proposed method achieved over 70% accuracy on the high-quality DFDM dataset.
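The abstract frames model attribution as a five-way classification over temporally pooled video features. As a rough illustration (not the paper's DMA-STA architecture, whose details are not given here), the sketch below shows temporal attention pooling over per-frame feature vectors followed by a linear classifier over the five DFDM generation models; all weights and dimensions are placeholder assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_attention_pool(frame_feats, w_score):
    """Weight per-frame features by attention scores, then sum
    into a single clip-level descriptor (illustrative only)."""
    scores = frame_feats @ w_score          # one score per frame, shape (T,)
    alpha = softmax(scores)                 # attention weights over frames
    return alpha @ frame_feats              # weighted sum, shape (D,)

def classify(clip_feat, w_cls, b_cls):
    """Map the clip descriptor to probabilities over generation models."""
    return softmax(clip_feat @ w_cls + b_cls)

# Placeholder setup: 16 frames, 64-dim features, 5 DFDM generation models.
rng = np.random.default_rng(0)
T, D, C = 16, 64, 5
frames = rng.normal(size=(T, D))            # stand-in for per-frame CNN features
w_score = rng.normal(size=D)                # hypothetical attention parameters
w_cls, b_cls = rng.normal(size=(D, C)), np.zeros(C)

probs = classify(temporal_attention_pool(frames, w_score), w_cls, b_cls)
pred = int(np.argmax(probs))                # index of the predicted model
```

In practice the spatial attention stage would reweight regions within each frame before pooling, and all parameters would be learned end-to-end with a cross-entropy loss over the five model labels.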