One Shot Face Swapping on Megapixels

Authors: Yuhao Zhu, Qi Li, Jian Wang, Chengzhong Xu, Zhenan Sun

Published: 2021-05-11 10:41:47+00:00

AI Summary

This paper presents MegaFS, the first megapixel-level one-shot face swapping method. It combines a hierarchical face representation, a face transfer module for synchronized attribute control, and StyleGAN2 for stable generation, avoiding the compressed face representations used by previous methods.

Abstract

Face swapping has both positive applications such as entertainment, human-computer interaction, etc., and negative applications such as DeepFake threats to politics, economics, etc. Nevertheless, it is necessary to understand the scheme of advanced methods for high-quality face swapping and generate enough and representative face swapping images to train DeepFake detection algorithms. This paper proposes the first Megapixel level method for one shot Face Swapping (or MegaFS for short). Firstly, MegaFS organizes face representation hierarchically by the proposed Hierarchical Representation Face Encoder (HieRFE) in an extended latent space to maintain more facial details, rather than compressed representation in previous face swapping methods. Secondly, a carefully designed Face Transfer Module (FTM) is proposed to transfer the identity from a source image to the target by a non-linear trajectory without explicit feature disentanglement. Finally, the swapped faces can be synthesized by StyleGAN2 with the benefits of its training stability and powerful generative capability. Each part of MegaFS can be trained separately so the requirement of our model for GPU memory can be satisfied for megapixel face swapping. In summary, complete face representation, stable training, and limited memory usage are the three novel contributions to the success of our method. Extensive experiments demonstrate the superiority of MegaFS and the first megapixel level face swapping database is released for research on DeepFake detection and face image editing in the public domain. The dataset is at this link.


Key findings
MegaFS achieves better megapixel face swapping results than existing methods, as demonstrated on FaceForensics++. The proposed hierarchical representation and the FTM both contribute significantly to identity preservation. The authors also released a new megapixel-level face swapping database.
Approach
MegaFS uses a three-stage approach: Hierarchical Representation Face Encoder (HieRFE) for complete face representation, Face Transfer Module (FTM) for identity transfer without explicit feature disentanglement, and StyleGAN2 for high-fidelity face generation. Each stage is trained separately to manage GPU memory.
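
The data flow through the three stages can be illustrated with a minimal sketch. It is not the authors' implementation: the functions `make_encoder`, `transfer`, `generate`, and `swap_face` are hypothetical stand-ins, the 18 x 512 latent layout is an assumption based on the usual extended latent space of a 1024 x 1024 StyleGAN2, and the sigmoid-gated blend only stands in for the FTM's non-linear trajectory. Only the order of operations (encode both faces, transfer in latent space, generate) comes from the description above.

```python
import numpy as np

# Assumed layout: one 512-d latent per generator layer (hypothetical).
NUM_LAYERS, LATENT_DIM = 18, 512


def make_encoder(seed=0):
    """Stand-in for HieRFE: fixed random weights map an image to layered latents."""
    weights = np.random.default_rng(seed).standard_normal((NUM_LAYERS, LATENT_DIM, 3))

    def encode(image):
        # Crude global feature: the mean colour of the image, shape (3,).
        feature = image.mean(axis=(0, 1))
        return weights @ feature  # shape (NUM_LAYERS, LATENT_DIM)

    return encode


def transfer(src_latent, tgt_latent):
    """Stand-in for FTM: an input-dependent (non-linear) blend of the two codes."""
    gate = 1.0 / (1.0 + np.exp(-(src_latent - tgt_latent)))  # sigmoid, in (0, 1)
    return gate * src_latent + (1.0 - gate) * tgt_latent


def generate(latent, size=8):
    """Stand-in for the StyleGAN2 generator: returns a dummy H x W x 3 image."""
    return np.tanh(latent.mean()) * np.ones((size, size, 3))


def swap_face(source_image, target_image, encode):
    """Run the three stages in order and return the latent and the image."""
    src_latent = encode(source_image)
    tgt_latent = encode(target_image)
    swapped_latent = transfer(src_latent, tgt_latent)
    return swapped_latent, generate(swapped_latent)
```

Because each stage only exchanges arrays with its neighbours, the stand-ins could in principle be trained or replaced one at a time, which mirrors the separate training that the summary credits for keeping GPU memory manageable.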
Datasets
CelebA, CelebA-HQ, FFHQ, FaceForensics++
Model(s)
StyleGAN2, ResNet50, ArcFace (for identity loss), a facial landmark predictor (for landmark loss), a perceptual feature extractor (for LPIPS loss)
Author countries
China, Macao