Masked Conditional Diffusion Model for Enhancing Deepfake Detection

Authors: Tiewen Chen, Shanmin Yang, Shu Hu, Zhenghan Fang, Ying Fu, Xi Wu, Xin Wang

Published: 2024-02-01 12:06:55+00:00

AI Summary

This paper proposes a Masked Conditional Diffusion Model (MCDM) for enhancing deepfake detection by generating diverse and high-quality forged faces from masked pristine ones. This data augmentation strategy improves the robustness and generalization ability of deepfake detection models, mitigating the performance degradation seen when testing on unseen datasets.

Abstract

Recent studies on deepfake detection have achieved promising results when training and testing faces are from the same dataset. However, their results severely degrade when confronted with forged samples that the model has not seen during training. In this paper, we use deepfake data to help detect deepfakes. We offer a new insight into diffusion model-based data augmentation and propose a Masked Conditional Diffusion Model (MCDM) for enhancing deepfake detection. It generates a variety of forged faces from a masked pristine one, encouraging the deepfake detection model to learn generic and robust representations without overfitting to specific artifacts. Extensive experiments demonstrate that forgery images generated with our method are of high quality and help improve the performance of deepfake detection models.


Key findings
The MCDM generates high-quality deepfake images, outperforming other diffusion models in terms of FID. The augmented dataset significantly improves deepfake detection performance, particularly in cross-dataset scenarios, achieving higher AUC scores on unseen datasets. Grad-CAM visualizations show the model focuses on forged facial borders, enhancing detection accuracy.
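The AUC figures mentioned above measure how well a detector ranks forged faces above pristine ones. As a minimal sketch of that metric (not the authors' evaluation code), the pairwise definition can be written in plain Python. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    labels: 1 for forged (positive), 0 for pristine (negative).
    scores: detector outputs, where higher means "more likely forged".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores, not results from the paper).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

In a cross-dataset setting, the same function would be applied to scores from a detector trained on one dataset and evaluated on another. The quadratic loop is fine for small examples; a rank-based implementation is preferable for large test sets.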
Approach
The authors propose a masked conditional diffusion model that generates a variety of forged faces from partially masked pristine images. Because the original image features outside the masked region are preserved, the generated forgeries stay realistic, and they are used to augment the training data of deepfake detection models.
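The masking idea can be illustrated with a small NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the rectangular mask, the helper names `make_box_mask`, `mask_pristine`, and `composite`, and the random array standing in for the diffusion model's output are all hypothetical. The sketch only shows how a masked condition image is formed and how content outside the mask can be kept identical to the pristine face.

```python
import numpy as np


def make_box_mask(height, width, top, left, box_h, box_w):
    """Binary mask of shape (H, W, 1): 1 inside the box to regenerate, 0 elsewhere."""
    mask = np.zeros((height, width, 1), dtype=np.float32)
    mask[top:top + box_h, left:left + box_w, :] = 1.0
    return mask


def mask_pristine(image, mask, fill=0.0):
    """Blank out the masked region; the result would condition the generator."""
    return image * (1.0 - mask) + fill * mask


def composite(pristine, generated, mask):
    """Keep pristine pixels outside the mask and generated pixels inside it."""
    return pristine * (1.0 - mask) + generated * mask


rng = np.random.default_rng(0)
pristine = rng.random((64, 64, 3), dtype=np.float32)   # placeholder face image
mask = make_box_mask(64, 64, top=16, left=16, box_h=32, box_w=32)
condition = mask_pristine(pristine, mask)

# Assumption: a real pipeline would call a conditional diffusion model here;
# a random array stands in for its output.
generated = rng.random((64, 64, 3), dtype=np.float32)
forged = composite(pristine, generated, mask)
```

Sampling different masks, or different outputs for the same mask, would give several forged variants of one pristine face, which is the kind of diversity the augmentation relies on.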
Datasets
FaceForensics++ (FF++), Celeb-DFv2 (CDF), DeepFakeDetection (DFD)
Model(s)
Palette (pre-trained on CelebA-HQ), EfficientNet-B4 (pre-trained on ImageNet), U-Net architecture for MCDM
Author countries
China, USA