Turn Fake into Real: Adversarial Head Turn Attacks Against Deepfake Detection

Authors: Weijie Wang, Zhengyu Zhao, Nicu Sebe, Bruno Lepri

Published: 2023-09-03 07:01:34+00:00

AI Summary

This paper introduces AdvHeat, a novel 3D adversarial attack against deepfake detectors. Unlike previous 2D attacks, AdvHeat synthesizes realistic adversarial face views from a single fake image, demonstrating the vulnerability of deepfake detectors to realistic head-turn manipulations.

Abstract

Malicious use of deepfakes leads to serious public concerns and reduces people's trust in digital media. Although effective deepfake detectors have been proposed, they are substantially vulnerable to adversarial attacks. To evaluate the detector's robustness, recent studies have explored various attacks. However, all existing attacks are limited to 2D image perturbations, which are hard to translate into real-world facial changes. In this paper, we propose adversarial head turn (AdvHeat), the first attempt at 3D adversarial face views against deepfake detectors, based on face view synthesis from a single-view fake image. Extensive experiments validate the vulnerability of various detectors to AdvHeat in realistic, black-box scenarios. For example, AdvHeat based on a simple random search yields a high attack success rate of 96.8% with 360 searching steps. When additional query access is allowed, we can further reduce the step budget to 50. Additional analyses demonstrate that AdvHeat is better than conventional attacks on both the cross-detector transferability and robustness to defenses. The adversarial images generated by AdvHeat are also shown to have natural looks. Our code, including that for generating a multi-view dataset consisting of 360 synthetic views for each of 1000 IDs from FaceForensics++, is available at https://github.com/twowwj/AdvHeaT.


Key findings
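AdvHeat achieves attack success rates of up to 96.8% against various deepfake detectors in black-box settings, using a simple random search with 360 search steps. With additional query access, the step budget drops to 50. The attack outperforms conventional attacks in both cross-detector transferability and robustness to defenses, and the generated adversarial images retain a natural appearance.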
Approach
AdvHeat synthesizes multiple 3D face views from a single fake image using EG3D, a 3D-aware GAN. It then employs either random search or a gradient-based approach (with query access) to find views that cause the deepfake detector to misclassify the fake image as real.
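
The random-search variant described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `render_view` and `fake_score` are hypothetical callables standing in for the view synthesizer and the detector, and the yaw/pitch ranges and the 0.5 decision threshold are placeholder values not taken from the paper. The detector output is used only to check success; it does not guide the sampling.

```python
import random
from typing import Callable, Optional, Tuple

View = Tuple[float, float]  # (yaw, pitch) in degrees


def random_search_attack(
    render_view: Callable[[float, float], object],  # hypothetical view synthesizer
    fake_score: Callable[[object], float],          # hypothetical detector, P(fake)
    max_steps: int = 360,                           # step budget from the abstract
    yaw_range: Tuple[float, float] = (-30.0, 30.0),     # assumed range
    pitch_range: Tuple[float, float] = (-15.0, 15.0),   # assumed range
    threshold: float = 0.5,                         # assumed decision threshold
    seed: Optional[int] = None,
) -> Tuple[Optional[View], int]:
    """Sample head poses uniformly at random until one is classified as real.

    Returns (view, steps_used); view is None if the budget is exhausted.
    """
    rng = random.Random(seed)
    for step in range(1, max_steps + 1):
        yaw = rng.uniform(*yaw_range)
        pitch = rng.uniform(*pitch_range)
        image = render_view(yaw, pitch)
        # Success: the detector scores the rendered fake view as real.
        if fake_score(image) < threshold:
            return (yaw, pitch), step
    return None, max_steps


# Toy stand-ins so the sketch runs without any model weights.
def toy_render(yaw: float, pitch: float) -> View:
    # The "image" is just the pose itself.
    return (yaw, pitch)


def toy_detector(image: View) -> float:
    # Pretend the detector is only fooled by large head turns.
    yaw, _ = image
    return 0.1 if abs(yaw) > 20.0 else 0.9


view, steps = random_search_attack(toy_render, toy_detector, seed=0)
print(view, steps)
```

In a real setting, `render_view` would wrap the 3D-aware generator applied to the projected latent code of the fake image, and `fake_score` would wrap the target detector.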
Datasets
FaceForensics++
Model(s)
ResNet50, Xception, EfficientNet, Meso4, Meso4-Inc, GramNet, F3Net, MAT, ViT, M2TR; GFP-GAN-v3 (for image enhancement); EG3D (for 3D view synthesis); PTI (for latent space projection)
Author countries
Italy, China