Detecting Deepfakes with Self-Blended Images

Authors: Kaede Shiohara, Toshihiko Yamasaki

Published: 2022-04-18 15:44:35+00:00

AI Summary

This paper introduces self-blended images (SBIs), a novel synthetic training data generation method for deepfake detection. SBIs are created by blending subtly altered versions of a single pristine image, mimicking common forgery artifacts. This approach improves model generalization to unseen manipulations and datasets.

Abstract

In this paper, we present novel synthetic training data called self-blended images (SBIs) to detect deepfakes. SBIs are generated by blending pseudo source and target images from single pristine images, reproducing common forgery artifacts (e.g., blending boundaries and statistical inconsistencies between source and target images). The key idea behind SBIs is that more general and hardly recognizable fake samples encourage classifiers to learn generic and robust representations without overfitting to manipulation-specific artifacts. We compare our approach with state-of-the-art methods on FF++, CDF, DFD, DFDC, DFDCP, and FFIW datasets by following the standard cross-dataset and cross-manipulation protocols. Extensive experiments show that our method improves the model generalization to unknown manipulations and scenes. In particular, on DFDC and DFDCP where existing methods suffer from the domain gap between the training and test sets, our approach outperforms the baseline by 4.90% and 11.78% points in the cross-dataset evaluation, respectively.


Key findings
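In cross-dataset and cross-manipulation evaluations against state-of-the-art methods, the proposed SBI method improves generalization to unknown manipulations and scenes. In particular, on DFDC and DFDCP, where existing methods suffer from the domain gap between training and test sets, it outperforms the baseline by 4.90 and 11.78 percentage points, respectively, in the cross-dataset evaluation. The approach also generalizes well to different network architectures.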
Approach
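The authors generate synthetic training data, called self-blended images (SBIs), by blending a pseudo source image and a pseudo target image that are both derived from a single pristine image. This reproduces common forgery artifacts, such as blending boundaries and statistical inconsistencies between source and target, without relying on pairs of distinct images. A classifier is then trained to distinguish these SBIs from real images.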
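The blending idea can be illustrated with a minimal sketch. It is not the authors' pipeline: the elliptical mask, the gain/bias color change, the pixel shift, and the function names soft_ellipse_mask and self_blend are all assumptions made for illustration. The only part taken from the description above is that a slightly modified copy of one real image is blended back onto that same image.

```python
import numpy as np


def soft_ellipse_mask(height, width, softness=0.15):
    """Float mask in [0, 1]: 1 near the centre, fading to 0 at an elliptical edge.

    Hypothetical stand-in for a face-region mask; the shape is an assumption.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    # Normalised elliptical radius: 1.0 on the ellipse with semi-axes 0.4*H and 0.35*W.
    r = np.sqrt(((ys - cy) / (0.4 * height)) ** 2 + ((xs - cx) / (0.35 * width)) ** 2)
    return np.clip((1.0 - r) / softness, 0.0, 1.0)


def self_blend(image, rng, max_shift=2):
    """Blend a slightly perturbed copy of `image` back onto `image`.

    image: float array of shape (H, W, 3) with values in [0, 1].
    Returns (blended image, mask). All perturbation ranges are illustrative.
    """
    h, w, _ = image.shape
    # Pseudo source: the same image with a small colour change (assumed gain/bias).
    gain = rng.uniform(0.9, 1.1)
    bias = rng.uniform(-0.05, 0.05)
    source = np.clip(image * gain + bias, 0.0, 1.0)
    # Small translation so the blended region is slightly misaligned.
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    source = np.roll(source, shift=(int(dy), int(dx)), axis=(0, 1))
    # Pseudo target: the untouched original. Alpha-blend the two with the soft mask.
    mask = soft_ellipse_mask(h, w)[..., None]
    blended = mask * source + (1.0 - mask) * image
    return blended, mask[..., 0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.random((64, 64, 3))  # placeholder for a real face crop
    fake, mask = self_blend(real, rng)
    # A training batch would pair `real` with label 0 and `fake` with label 1.
    print(fake.shape, float(mask.min()), float(mask.max()))
```

In this sketch the real image would carry label 0 and its self-blended counterpart label 1, so a binary classifier sees both versions of the same face and has to rely on the blending artifacts to tell them apart.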
Datasets
FF++, CDF, DFD, DFDC, DFDCP, FFIW, FaceShifter, DeeperForensics-1.0
Model(s)
EfficientNet-b4 (primary), ResNet-50, ResNet-152, Xception, EfficientNet-b1. Some of these (e.g., Xception) are also used for comparison with baseline methods.
Author countries
Japan