NullSwap: Proactive Identity Cloaking Against Deepfake Face Swapping

Authors: Tianyi Wang, Harry Cheng, Xiao Zhang, Yinglong Wang

Published: 2025-03-24 13:49:39+00:00

AI Summary

NullSwap is a novel proactive defense against Deepfake face swapping that cloaks source image identities by embedding imperceptible, identity-guided perturbations. It operates in a pure black-box setting, requiring no generative models during training, and outperforms existing proactive perturbation methods at preventing face swapping models from reproducing the correct source identity.

Abstract

Suffering from performance bottlenecks in passively detecting high-quality Deepfake images due to the advancement of generative models, proactive perturbations offer a promising approach to disabling Deepfake manipulations by inserting signals into benign images. However, existing proactive perturbation approaches remain unsatisfactory in several aspects: 1) visual degradation due to direct element-wise addition; 2) limited effectiveness against face swapping manipulation; 3) unavoidable reliance on white- and grey-box settings to involve generative models during training. In this study, we analyze the essence of Deepfake face swapping and argue the necessity of protecting source identities rather than target images, and we propose NullSwap, a novel proactive defense approach that cloaks source image identities and nullifies face swapping under a pure black-box scenario. We design an Identity Extraction module to obtain facial identity features from the source image, while a Perturbation Block is then devised to generate identity-guided perturbations accordingly. Meanwhile, a Feature Block extracts shallow-level image features, which are then fused with the perturbation in the Cloaking Block for image reconstruction. Furthermore, to ensure adaptability across different identity extractors in face swapping algorithms, we propose Dynamic Loss Weighting to adaptively balance identity losses. Experiments demonstrate the outstanding ability of our approach to fool various identity recognition models, outperforming state-of-the-art proactive perturbations in preventing face swapping models from generating images with correct source identities.
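The abstract's core argument, that protecting the source identity is enough to nullify a swap, follows from how typical swapping pipelines work: the generator is conditioned on an identity embedding extracted from the source, so corrupting that embedding corrupts the transferred identity. The sketch below illustrates this with placeholder modules; none of the names, layers, or the simple additive noise stand-in come from the paper.

```python
# Illustrative sketch (not the paper's code): a generic face swapping pipeline
# injects the source identity embedding into the target's attributes, so
# cloaking the *source* corrupts the identity carried by the swapped result.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DummyIdEncoder(nn.Module):
    """Stand-in for a face recognition backbone (e.g., an ArcFace-style encoder)."""
    def __init__(self, dim=512):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.LazyLinear(dim))
    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

class DummySwapGenerator(nn.Module):
    """Stand-in for a face swapping generator conditioned on an identity embedding."""
    def __init__(self):
        super().__init__()
        self.proj = nn.LazyLinear(3 * 112 * 112)
    def forward(self, id_embed, target):
        # A real generator would fuse id_embed with target attributes; this is a stub.
        return self.proj(id_embed).view(-1, 3, 112, 112) + target

id_enc = DummyIdEncoder()
gen = DummySwapGenerator()

source = torch.rand(1, 3, 112, 112)                    # benign source face
cloaked = source + 0.03 * torch.randn_like(source)     # placeholder for a NullSwap-style imperceptible perturbation
target = torch.rand(1, 3, 112, 112)

swap_clean = gen(id_enc(source), target)    # carries the true source identity
swap_cloaked = gen(id_enc(cloaked), target) # identity embedding is corrupted, so the swap fails to transfer it

# The defense succeeds when the swapped result no longer matches the true identity:
sim = F.cosine_similarity(id_enc(swap_cloaked), id_enc(source))
print(f"identity similarity after cloaking: {sim.item():.3f}")
```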


Key findings

NullSwap significantly reduces the accuracy of identity recognition models on perturbed images and successfully prevents various state-of-the-art face swapping models from generating images with correct source identities. The approach maintains high visual quality in the perturbed images.
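As a rough illustration of how such a drop in identity recognition is typically quantified, a cosine-similarity match rate against enrolled identities could be computed as below. The embedder, threshold, and data handling are placeholders, not the paper's evaluation protocol.

```python
# Hedged sketch: fraction of images whose embedding still matches the enrolled
# identity under a cosine-similarity threshold. A successful cloak drives this
# rate well below the rate measured on the original, unperturbed images.
import torch
import torch.nn.functional as F

@torch.no_grad()
def identity_match_rate(embedder, images, enrolled_embeds, labels, threshold=0.4):
    """images: (N, 3, H, W); enrolled_embeds: (K, D) gallery; labels: (N,) identity indices."""
    embeds = F.normalize(embedder(images), dim=-1)             # (N, D)
    sims = embeds @ F.normalize(enrolled_embeds, dim=-1).T     # (N, K) similarity to every enrolled identity
    best = sims.argmax(dim=-1)
    matched = (best == labels) & (sims.max(dim=-1).values > threshold)
    return matched.float().mean().item()

# Usage (hypothetical names):
# rate_clean   = identity_match_rate(arcface, clean_images,   gallery, labels)
# rate_cloaked = identity_match_rate(arcface, cloaked_images, gallery, labels)
```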
Approach

NullSwap inserts identity-guided perturbations into source images to prevent face swapping algorithms from extracting correct identity information. It uses an Identity Extraction module, a Perturbation Block, a Feature Block, and a Cloaking Block for image reconstruction, along with Dynamic Loss Weighting to ensure adaptability across different identity extractors.
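A minimal sketch of how these four components could fit together is given below, assuming plausible module internals; the layer choices, dimensions, and fusion scheme are illustrative assumptions, not the authors' architecture.

```python
# Hedged sketch of the NullSwap pipeline described above:
# identity features -> identity-guided perturbation -> fusion with shallow
# image features -> reconstruction of the cloaked image.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NullSwapSketch(nn.Module):
    def __init__(self, id_dim=512, feat_ch=64):
        super().__init__()
        # Identity Extraction: facial identity features from the source image.
        self.identity_extractor = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, id_dim))
        # Perturbation Block: identity-guided perturbation map.
        self.perturbation_block = nn.Sequential(
            nn.Linear(id_dim, feat_ch * 16 * 16), nn.ReLU())
        # Feature Block: shallow-level image features.
        self.feature_block = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU())
        # Cloaking Block: fuse perturbation with image features and reconstruct.
        self.cloaking_block = nn.Sequential(
            nn.Conv2d(feat_ch * 2, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, 3, 3, padding=1), nn.Tanh())

    def forward(self, source):
        b, _, h, w = source.shape
        identity = self.identity_extractor(source)                    # (B, id_dim)
        pert = self.perturbation_block(identity).view(b, -1, 16, 16)  # identity-guided perturbation
        pert = F.interpolate(pert, size=(h, w), mode="bilinear", align_corners=False)
        feats = self.feature_block(source)                            # shallow image features
        return self.cloaking_block(torch.cat([feats, pert], dim=1))   # fused reconstruction (cloaked image)

cloaked = NullSwapSketch()(torch.rand(2, 3, 112, 112))  # -> (2, 3, 112, 112)
```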
Datasets

CelebA-HQ, LFW
Model(s)

ArcFace, FaceNet, VGGFace, and SFace for identity-recognition evaluation; a custom model (NullSwap) is trained for perturbation generation.
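Because identity extractors such as those listed above produce losses at different scales, the Dynamic Loss Weighting mentioned in the approach must rebalance them adaptively. One plausible rule, inverse-magnitude normalization of the per-extractor losses, is sketched below as an assumption rather than the paper's exact formulation.

```python
# Hedged sketch of adaptively balancing identity losses across several
# identity extractors so that no single extractor dominates training.
import torch

def dynamic_loss_weighting(identity_losses):
    """identity_losses: list of scalar tensors, one per identity extractor.

    Weights are derived from detached loss magnitudes; inverse-magnitude
    normalization is an assumption, not the paper's exact weighting rule.
    """
    losses = torch.stack(identity_losses)
    with torch.no_grad():
        inv = 1.0 / (losses + 1e-8)
        weights = inv / inv.sum() * len(identity_losses)
    return (weights * losses).sum()

# Usage (hypothetical names):
# loss = dynamic_loss_weighting([loss_arcface, loss_facenet, loss_vggface, loss_sface])
```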
Author countries

Singapore, China