UCF: Uncovering Common Features for Generalizable Deepfake Detection

Authors: Zhiyuan Yan, Yong Zhang, Yanbo Fan, Baoyuan Wu

Published: 2023-04-27 04:07:29+00:00

AI Summary

This paper introduces a multi-task disentanglement framework for deepfake detection that improves generalization by identifying forgery features common to different forgery methods. The framework disentangles image information into forgery-irrelevant, method-specific forgery, and common forgery features and uses only the common forgery features for detection. The authors report better generalization than state-of-the-art methods.

Abstract

Deepfake detection remains a challenging task due to the difficulty of generalizing to new types of forgeries. This problem primarily stems from the overfitting of existing detection methods to forgery-irrelevant features and method-specific patterns. The latter has rarely been studied and has not been well addressed by previous works. This paper presents a novel approach to address the two types of overfitting issues by uncovering common forgery features. Specifically, we first propose a disentanglement framework that decomposes image information into three distinct components: forgery-irrelevant, method-specific forgery, and common forgery features. To ensure the decoupling of method-specific and common forgery features, a multi-task learning strategy is employed, including a multi-class classification that predicts the category of the forgery method and a binary classification that distinguishes the real from the fake. Additionally, a conditional decoder is designed to utilize forgery features as a condition along with forgery-irrelevant features to generate reconstructed images. Furthermore, a contrastive regularization technique is proposed to encourage the disentanglement of the common and specific forgery features. Ultimately, we only utilize the common forgery features for the purpose of generalizable deepfake detection. Extensive evaluations demonstrate that our framework achieves better generalization than current state-of-the-art methods.

Key findings

The proposed framework outperforms existing state-of-the-art methods in generalization across unseen datasets. Ablation studies confirm the effectiveness of the disentanglement framework, multi-task learning, contrastive regularization, and conditional decoder in improving generalization. The use of common forgery features significantly improves performance compared to using specific or whole forgery features.

Approach

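The approach uses a multi-task learning framework in which an encoder disentangles image features into forgery-irrelevant (content), method-specific forgery, and common forgery components. A multi-class head predicts the forgery method and a binary head distinguishes real from fake. A conditional decoder reconstructs images from the forgery-irrelevant features, with the forgery features as the condition, and a contrastive regularization loss further encourages the separation of common and specific forgery features. Only the common forgery features are used for deepfake classification.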
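The two classification tasks described above can be illustrated with a minimal sketch. This is not the authors' implementation: the flat feature vector, the slice sizes `n_content` and `n_specific`, the linear heads `w_bin` and `w_multi`, the weight `lam`, and the assignment of the binary head to the common part and the multi-class head to the specific part are all assumptions made for illustration. The encoder, the conditional decoder, and the contrastive regularization term are omitted.

```python
import math

def split_features(feat, n_content, n_specific):
    """Split one flat feature vector into (content, specific, common) parts.

    The slice layout is a hypothetical choice for this sketch.
    """
    content = feat[:n_content]
    specific = feat[n_content:n_content + n_specific]
    common = feat[n_content + n_specific:]
    return content, specific, common

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Negative log-probability of the true class index."""
    return -math.log(softmax(logits)[label])

def linear(weights, x):
    """Bias-free linear head; `weights` holds one row per output class."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def multitask_loss(feat, is_fake, method_id, w_bin, w_multi,
                   n_content, n_specific, lam=1.0):
    """Binary real/fake loss on the common part plus a weighted
    multi-class forgery-method loss on the specific part."""
    _, specific, common = split_features(feat, n_content, n_specific)
    loss_bin = cross_entropy(linear(w_bin, common), is_fake)
    loss_multi = cross_entropy(linear(w_multi, specific), method_id)
    return loss_bin + lam * loss_multi

# Toy usage: 6-dim feature, 2 dims per part, 3 hypothetical forgery methods.
feat = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
w_bin = [[0.0, 0.0], [0.0, 0.0]]
w_multi = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
loss = multitask_loss(feat, is_fake=1, method_id=2, w_bin=w_bin,
                      w_multi=w_multi, n_content=2, n_specific=2)
```

With all-zero head weights both heads predict uniform probabilities, so the toy loss equals ln 2 + ln 3. At inference time only the binary head on the common part would be evaluated, which mirrors the statement that only common forgery features are used for detection.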

Datasets

FaceForensics++ (FF++) (raw, HQ, LQ versions), DeepfakeDetection (DFD), Deepfake Detection Challenge (DFDC), Celeb-DF

Model(s)

Modified Xception (primary backbone). ConvNeXt, ResNet, and EfficientNet are also mentioned in the supplementary materials.

Author countries

China
