Synthesizing Black-box Anti-forensics DeepFakes with High Visual Quality

Authors: Bing Fan, Shu Hu, Feng Ding

Published: 2023-12-17 13:12:34+00:00

AI Summary

This paper proposes a method for generating DeepFakes that evade forensic detectors in a black-box setting while retaining high visual quality. A two-stage network produces adversarial sharpening masks: perturbations that sharpen the image and, at the same time, disrupt state-of-the-art DeepFake detectors.

Abstract

DeepFake, an AI technology for creating facial forgeries, has garnered global attention. In response, forensics researchers focus on developing defensive algorithms to counter these threats. In contrast, there are techniques developed for enhancing the aggressiveness of DeepFake, e.g., through anti-forensics attacks, to disrupt forensic detectors. However, such attacks often sacrifice image visual quality for improved undetectability. To address this issue, we propose a method to generate novel adversarial sharpening masks for launching black-box anti-forensics attacks. Unlike many existing methods, with such perturbations injected, DeepFakes can achieve high anti-forensics performance while exhibiting pleasant sharpening visual effects. Experimental evaluations show that the proposed method can successfully disrupt state-of-the-art DeepFake detectors. Moreover, compared with images processed by existing DeepFake anti-forensics methods, the visual quality of the anti-forensics DeepFakes rendered by the proposed method is significantly improved.

Key findings

The proposed method successfully disrupted state-of-the-art deepfake detectors while significantly improving the visual quality compared to existing anti-forensics methods. The generated deepfakes were visually appealing, lacking the artifacts common in previous approaches. The results were evaluated using PSNR, SSIM, and face detection metrics.
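As an illustration of one of the quality metrics named above, PSNR can be computed from the mean squared error between a reference image and its processed version. This is a minimal sketch, not the authors' evaluation code: the function name `psnr`, the nested-list grayscale representation, and the 8-bit peak value of 255 are assumptions made for the example.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two same-sized grayscale images.

    Images are given as 2D lists of pixel intensities (an assumption of this
    sketch; real pipelines typically use array libraries).
    """
    ref = [p for row in reference for p in row]
    dis = [p for row in distorted for p in row]
    if not ref or len(ref) != len(dis):
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, dis)) / len(ref)
    if mse == 0:
        return math.inf  # identical images

    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher PSNR means the processed image deviates less from the reference; identical images give infinity under this definition.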

Approach

The approach uses a two-stage network: a Forensics Disruption Network (FDN) generates images that evade detection, and a Visual Enhancement Network (VEN) refines their visual quality by sharpening them. The VEN uses MobileViT blocks and a parameter-frozen training strategy to improve visual quality without compromising undetectability.
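To make the idea of a sharpening mask concrete, the sketch below uses classical unsharp masking: the mask is the difference between an image and a blurred copy, and adding it back accentuates edges. This is a hand-crafted stand-in, not the paper's method, where the mask is produced by a trained network and is also adversarial. The function names, the 3x3 box blur, and the `amount` parameter are all assumptions of this example.

```python
def box_blur(image):
    """3x3 mean filter on a 2D list of floats, replicating edge pixels."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to image bounds
                    xx = min(max(x + dx, 0), w - 1)
                    total += image[yy][xx]
            out[y][x] = total / 9.0
    return out


def sharpening_mask(image):
    """High-frequency residual: image minus its blurred copy."""
    blurred = box_blur(image)
    return [[p - b for p, b in zip(row, brow)]
            for row, brow in zip(image, blurred)]


def apply_mask(image, mask, amount=1.0, max_value=255.0):
    """Add the scaled mask to the image and clip to the valid pixel range."""
    return [[min(max(p + amount * m, 0.0), max_value)
             for p, m in zip(row, mrow)]
            for row, mrow in zip(image, mask)]
```

On a flat region the mask is zero and the image is unchanged; across an edge the dark side gets darker and the bright side brighter, which is the visual sharpening effect.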

Datasets

Celeb-DF, FaceForensics++, DeeperForensics

Model(s)

ResNet-50, DenseNet-121, EfficientNet, MobileNet, ShuffleNet, ConvNeXt, EfficientNet-SBIs, and a custom two-stage GAN architecture (FDN and VEN) incorporating MobileViT blocks.

Author countries

China, USA
