NullSwap: Proactive Identity Cloaking Against Deepfake Face Swapping

Authors: Tianyi Wang, Harry Cheng, Xiao Zhang, Yinglong Wang

Published: 2025-03-24 13:49:39+00:00

AI Summary

This paper proposes NullSwap, a novel proactive defense against Deepfake face swapping that cloaks source image identities under a pure black-box scenario. NullSwap embeds identity-guided perturbations into benign images, preventing face swapping models from extracting correct identity features and thereby nullifying the generation of desired identities. Experiments show NullSwap outperforms existing proactive perturbation methods in both visual quality of perturbed images and effectiveness in fooling identity recognition models and face swapping algorithms.

Abstract

As passive detection of high-quality Deepfake images suffers performance bottlenecks due to the advancement of generative models, proactive perturbations offer a promising approach to disabling Deepfake manipulations by inserting signals into benign images. However, existing proactive perturbation approaches remain unsatisfactory in several aspects: 1) visual degradation due to direct element-wise addition; 2) limited effectiveness against face swapping manipulation; 3) unavoidable reliance on white- and grey-box settings that involve generative models during training. In this study, we analyze the essence of Deepfake face swapping and argue the necessity of protecting source identities rather than target images, and we propose NullSwap, a novel proactive defense approach that cloaks source image identities and nullifies face swapping under a pure black-box scenario. We design an Identity Extraction module to obtain facial identity features from the source image, while a Perturbation Block is then devised to generate identity-guided perturbations accordingly. Meanwhile, a Feature Block extracts shallow-level image features, which are then fused with the perturbation in the Cloaking Block for image reconstruction. Furthermore, to ensure adaptability across different identity extractors in face swapping algorithms, we propose Dynamic Loss Weighting to adaptively balance identity losses. Experiments demonstrate the outstanding ability of our approach to fool various identity recognition models, outperforming state-of-the-art proactive perturbations in preventing face swapping models from generating images with correct source identities.
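The abstract mentions Dynamic Loss Weighting to balance identity losses from multiple face recognition extractors, but does not spell out the formula. The sketch below is a generic, hypothetical softmax-style weighting scheme (names `dynamic_loss_weights` and `weighted_identity_loss` are illustrative, not from the paper): extractors that are currently hardest to fool (largest loss) receive larger weights.

```python
import math

def dynamic_loss_weights(identity_losses, temperature=1.0):
    """Hypothetical adaptive weighting over per-extractor identity
    losses (the paper's exact Dynamic Loss Weighting may differ).
    Larger current loss -> larger weight, focusing training on the
    recognizer that is hardest to fool."""
    scaled = [l / temperature for l in identity_losses]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_identity_loss(identity_losses):
    """Combine the per-extractor losses into one training objective."""
    weights = dynamic_loss_weights(identity_losses)
    return sum(w * l for w, l in zip(weights, identity_losses))
```

For example, with losses from three recognition backbones, `dynamic_loss_weights([1.0, 2.0, 3.0])` assigns the third backbone the largest weight while the weights still sum to one.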


Key findings
NullSwap achieves superior visual quality for perturbed images, with PSNR above 40, SSIM above 0.98, and LPIPS below 0.005, outperforming state-of-the-art methods. It effectively cloaks source identities, reducing average Top-5 and Top-1 face recognition accuracies to 0.711 and 0.565, respectively, and successfully nullifies various face swapping models, driving identity similarity in the swapped results down to roughly 0.33 on average, far below that of swaps produced from clean images.
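The numbers above rest on two standard metrics: PSNR for the visual quality of the perturbed image, and cosine similarity between face embeddings for identity agreement. A minimal reference sketch (not the authors' evaluation code, and the real pipeline would use a pretrained recognizer to produce the embeddings):

```python
import math

def psnr(img_a, img_b, max_val=255.0):
    """Peak signal-to-noise ratio between two images given as flat
    pixel sequences; higher means a less visible perturbation."""
    mse = sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(max_val ** 2 / mse)

def identity_similarity(emb_a, emb_b):
    """Cosine similarity between two face-embedding vectors; values
    near 1 mean the recognizer sees the same identity, values near 0
    mean the identity has been effectively cloaked."""
    dot = sum(a * b for a, b in zip(emb_a, emb_b))
    norm_a = math.sqrt(sum(a * a for a in emb_a))
    norm_b = math.sqrt(sum(b * b for b in emb_b))
    return dot / (norm_a * norm_b)
```

In this vocabulary, "PSNR above 40" means the cloaked image is nearly indistinguishable from the original, while an identity similarity around 0.33 means swapped faces no longer match the protected source identity.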
Approach
NullSwap cloaks source identities by embedding imperceptible, identity-guided perturbations into input images. It uses an Identity Extraction module, a Perturbation Block to generate the perturbation, and a Feature Block with a Cloaking Block to reconstruct the image with the embedded perturbation. A Dynamic Loss Weighting mechanism is devised to adaptively balance identity losses from multiple face recognition tools, ensuring generalizability in a black-box setting.
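The four modules described above can be sketched as a pipeline. The module names follow the paper, but every internal here is an illustrative stand-in (random projections and an identity mapping rather than the authors' trained networks), so this shows only the data flow, not the real architecture.

```python
import numpy as np

def identity_extraction(image):
    """Stand-in identity extractor: a fixed random projection of the
    image to a 128-d unit-norm identity feature (a real system would
    use a pretrained face recognition backbone)."""
    rng = np.random.default_rng(0)  # fixed seed -> fixed projection
    proj = rng.standard_normal((128, image.size))
    feat = proj @ image.ravel()
    return feat / (np.linalg.norm(feat) + 1e-8)

def perturbation_block(identity_feat, shape, epsilon=0.03):
    """Map the identity feature to an image-shaped, bounded
    perturbation, so the noise is identity-guided rather than
    image-agnostic."""
    rng = np.random.default_rng(1)
    decode = rng.standard_normal((int(np.prod(shape)), identity_feat.size))
    pert = np.tanh(decode @ identity_feat).reshape(shape)
    return epsilon * pert  # small magnitude keeps it imperceptible

def feature_block(image):
    """Shallow-level image features; an identity mapping stands in
    for the paper's convolutional feature extractor."""
    return image

def cloaking_block(shallow_feat, perturbation):
    """Fuse shallow features with the perturbation to reconstruct the
    cloaked image, instead of naive element-wise pixel addition."""
    return np.clip(shallow_feat + perturbation, 0.0, 1.0)

def nullswap_cloak(image):
    """Full sketch: extract identity, generate a guided perturbation,
    and fuse it back into the reconstructed image."""
    ident = identity_extraction(image)
    pert = perturbation_block(ident, image.shape)
    feats = feature_block(image)
    return cloaking_block(feats, pert)
```

The design point the sketch preserves is that the perturbation is derived from the source identity feature and embedded through reconstruction, which is what lets the defense operate in a black-box setting: no face swapping model is consulted at any step.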
Datasets
CelebA-HQ, Labeled Faces in the Wild (LFW)
Model(s)
UNKNOWN
Author countries
Singapore, China