DeepfakeUCL: Deepfake Detection via Unsupervised Contrastive Learning

Authors: Sheldon Fung, Xuequan Lu, Chao Zhang, Chang-Tsun Li

Published: 2021-04-23 09:48:10+00:00

AI Summary

This paper proposes DeepfakeUCL, a novel deepfake detection method using unsupervised contrastive learning. It generates two transformed versions of an image, trains an encoder-projection head network to maximize their agreement, and then uses the learned features for efficient linear classification.

Abstract

Face deepfake detection has seen impressive results recently. Nearly all existing deep learning techniques for face deepfake detection are fully supervised and require labels during training. In this paper, we design a novel deepfake detection method via unsupervised contrastive learning. We first generate two different transformed versions of an image and feed them into two sequential sub-networks, i.e., an encoder and a projection head. The unsupervised training is achieved by maximizing the correspondence degree of the outputs of the projection head. To evaluate the detection performance of our unsupervised method, we further use the unsupervised features to train an efficient linear classification network. Extensive experiments show that our unsupervised learning method enables comparable detection performance to state-of-the-art supervised techniques, in both the intra- and inter-dataset settings. We also conduct ablation studies for our method.

Key findings

DeepfakeUCL achieves comparable performance to state-of-the-art supervised methods, even outperforming some in cross-dataset settings. Ablation studies show the effectiveness of unsupervised contrastive learning and the importance of the encoder's output for classification. The method's performance improves with more complex data augmentation.

Approach

DeepfakeUCL uses unsupervised contrastive learning. An encoder followed by a projection head is trained, without labels, on pairs of augmented versions of the same image so that the two projected outputs agree as closely as possible. The learned features are then used to train a linear classifier that separates real from fake images.
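
The summary above does not state the exact objective used to maximize agreement. As a rough illustration only, the sketch below assumes a SimCLR-style NT-Xent loss, a common choice for this kind of training; the function name, the temperature value, and the use of NumPy are all assumptions, not details from the paper.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """Assumed NT-Xent loss over N positive pairs (z1[i], z2[i]).

    z1, z2: (N, d) projection-head outputs for two augmented views
    of the same N images. The temperature is a hypothetical default.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)               # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit-normalise rows
    sim = (z @ z.T) / temperature                      # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                     # a sample is never its own negative

    # Index of the positive partner for every row: i <-> i + N.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))

    # Average negative log-probability assigned to the positive partner.
    return -log_prob[np.arange(2 * n), pos].mean()
```

Under this assumed loss, two views that map to nearby projections yield a lower value than unrelated projections, which is the sense in which training "maximizes agreement".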

Datasets

FaceForensics++, UADFV, Celeb-DF

Model(s)

Xception (as encoder), linear layers (projection head), linear classification network
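
To make the final stage concrete, here is a minimal sketch of a linear classifier trained on frozen encoder features. It assumes the features have already been extracted as a NumPy array and that labels are 0 (real) or 1 (fake); the function names, learning rate, and epoch count are hypothetical, and the paper's actual classification network may differ.

```python
import numpy as np

def train_linear_probe(features, labels, lr=0.1, epochs=200):
    """Fit a logistic-regression layer on frozen features with gradient descent.

    features: (n, d) array of encoder outputs (assumed precomputed).
    labels:   (n,) array of 0/1 labels (assumed 0 = real, 1 = fake).
    """
    n, d = features.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        logits = features @ w + b
        probs = 1.0 / (1.0 + np.exp(-logits))   # sigmoid
        grad = probs - labels                   # gradient of cross-entropy w.r.t. logits
        w -= lr * (features.T @ grad) / n
        b -= lr * grad.mean()
    return w, b

def predict(features, w, b):
    """Return 1 where the linear score is positive, else 0."""
    return (features @ w + b > 0).astype(int)
```

Only `w` and `b` are updated here; the encoder that produced `features` stays fixed, so classification accuracy reflects the quality of the unsupervised features.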

Author countries

Australia, Japan
