CrossDF: Improving Cross-Domain Deepfake Detection with Deep Information Decomposition

Authors: Shanmin Yang, Hui Guo, Shu Hu, Bin Zhu, Ying Fu, Siwei Lyu, Xi Wu, Xin Wang

Published: 2023-09-30 12:30:25+00:00

AI Summary

This paper introduces a Deep Information Decomposition (DID) framework for cross-dataset deepfake detection. DID prioritizes high-level semantic features over specific visual artifacts. It decomposes facial features into deepfake-related and deepfake-irrelevant information and uses only the deepfake-related part for classification, which improves robustness to unseen forgery methods.

Abstract

Deepfake technology poses a significant threat to security and social trust. Although existing detection methods have shown high performance in identifying forgeries within datasets that use the same deepfake techniques for both training and testing, they suffer from sharp performance degradation when faced with cross-dataset scenarios where unseen deepfake techniques are tested. To address this challenge, we propose a Deep Information Decomposition (DID) framework to enhance the performance of Cross-dataset Deepfake Detection (CrossDF). Unlike most existing deepfake detection methods, our framework prioritizes high-level semantic features over specific visual artifacts. Specifically, it adaptively decomposes facial features into deepfake-related and irrelevant information, only using the intrinsic deepfake-related information for real/fake discrimination. Moreover, it optimizes these two kinds of information to be independent with a de-correlation learning module, thereby enhancing the model's robustness against various irrelevant information changes and generalization ability to unseen forgery methods. Our extensive experimental evaluation and comparison with existing state-of-the-art detection methods validate the effectiveness and superiority of the DID framework on cross-dataset deepfake detection.

Key findings

The DID framework is reported to achieve state-of-the-art performance on cross-dataset deepfake detection, outperforming the existing detection methods it is compared against. Ablation studies indicate that both the attention and decorrelation modules contribute to this performance. Visualizations show the deepfake-related and irrelevant information being separated.

Approach

The authors propose a DID framework that decomposes facial features into deepfake-related and irrelevant information using attention networks. A decorrelation learning module ensures independence between these components, enhancing robustness and generalization. A robust deepfake classification module uses a loss function combining binary cross-entropy and an AUC approximation.
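To make the combined classification loss more concrete, here is a minimal sketch in plain Python. This summary does not give the exact form of the AUC approximation, so the pairwise hinge-style surrogate below is an assumption, not the authors' formulation. The function names and the `margin`, `power`, and `alpha` parameters are hypothetical, as is the convention that label 1 means fake and 0 means real.

```python
import math


def bce_loss(scores, labels, eps=1e-12):
    """Mean binary cross-entropy over predicted fake probabilities."""
    total = 0.0
    for s, y in zip(scores, labels):
        s = min(max(s, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(s) + (1 - y) * math.log(1.0 - s))
    return total / len(scores)


def auc_surrogate_loss(scores, labels, margin=0.3, power=2):
    """Pairwise surrogate for 1 - AUC (assumed form, not from the paper).

    Every (fake, real) pair whose score gap falls below `margin`
    contributes (margin - gap) ** power to the loss.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return 0.0  # AUC is undefined when one class is missing
    total = 0.0
    for p in pos:
        for n in neg:
            gap = p - n
            if gap < margin:
                total += (margin - gap) ** power
    return total / (len(pos) * len(neg))


def combined_loss(scores, labels, alpha=0.5):
    """Weighted sum of BCE and the AUC surrogate; alpha is a hypothetical weight."""
    return (alpha * bce_loss(scores, labels)
            + (1.0 - alpha) * auc_surrogate_loss(scores, labels))


if __name__ == "__main__":
    labels = [1, 1, 0, 0]
    well_ranked = [0.9, 0.8, 0.1, 0.2]
    mis_ranked = [0.1, 0.2, 0.9, 0.8]
    print(combined_loss(well_ranked, labels))
    print(combined_loss(mis_ranked, labels))
```

The cross-entropy term scores each sample on its own, while the pairwise term depends only on how fake samples rank against real ones, which is the quantity AUC measures. In a training setup the same idea would be written with a differentiable tensor library so gradients can flow through both terms.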

Datasets

FaceForensics++ (FF++), Celeb-DF V2, DFFD

Model(s)

EfficientNet v2-L (backbone), Deepfake attention network, Domain attention network, Mutual information estimation network, Deepfake classification module (MLP), Domain classification module (MLP)

Author countries

China, USA
