Anisotropic multiresolution analyses for deepfake detection

Authors: Wei Huang, Michelangelo Valsecchi, Michael Multerer

Published: 2022-10-26 17:26:09+00:00

AI Summary

This paper proposes using anisotropic multiresolution analyses (fully separable wavelet transform and samplets) for deepfake detection. It argues that GANs, using primarily isotropic convolutions, leave detectable traces in the coefficient distribution of anisotropic transformations, which can be used to improve state-of-the-art deepfake detection accuracy.

Abstract

Generative Adversarial Networks (GANs) have paved the way towards entirely new media generation capabilities at the forefront of image, video, and audio synthesis. However, they can also be misused and abused to fabricate elaborate lies, capable of stirring up the public debate. The threat posed by GANs has sparked the need to discern between genuine and fabricated content. Previous studies have tackled this task by using classical machine learning techniques, such as k-nearest neighbours and eigenfaces, which unfortunately did not prove very effective. Subsequent methods have focused on leveraging frequency decompositions, i.e., the discrete cosine transform, wavelets, and wavelet packets, to preprocess the input features for classifiers. However, existing approaches rely only on isotropic transformations. We argue that, since GANs primarily utilize isotropic convolutions to generate their output, they leave clear traces, their fingerprint, in the coefficient distribution of sub-bands extracted by anisotropic transformations. We employ the fully separable wavelet transform and multiwavelets to obtain the anisotropic features to feed to standard CNN classifiers. Lastly, we find the fully separable transform capable of improving the state-of-the-art.


Key findings
Anisotropic transforms (fully separable wavelet transform and samplets) significantly improved deepfake detection accuracy compared to state-of-the-art methods on CelebA and LSUN bedroom datasets. The fully separable transform with reflect-padding consistently performed best across datasets, even outperforming existing methods on the FFHQ dataset. Anisotropic features also showed greater robustness to common image perturbations and better performance with limited training data.
Approach
The authors leverage the fully separable wavelet transform and samplets to extract anisotropic features from images. These features, whose coefficient distributions differ between real and fake images, are then fed into standard CNN classifiers for deepfake detection.
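Unlike the standard Mallat scheme, which alternates one decomposition level between rows and columns, the fully separable transform applies the complete multilevel 1D transform along the rows first and then along the columns, yielding anisotropic (elongated) sub-bands. A minimal NumPy sketch, assuming the orthonormal Haar wavelet and image sides divisible by 2^levels (the paper's choice of wavelet and padding mode is not reproduced here):

```python
import numpy as np

def haar_step(x):
    # One level of the 1D orthonormal Haar transform along the last axis.
    s = (x[..., 0::2] + x[..., 1::2]) / np.sqrt(2.0)  # approximation
    d = (x[..., 0::2] - x[..., 1::2]) / np.sqrt(2.0)  # detail
    return s, d

def haar_1d(x, levels):
    # Complete multilevel 1D Haar transform along the last axis,
    # packing [approx | detail_L | ... | detail_1] into one array.
    out = np.empty(x.shape, dtype=float)
    approx = x.astype(float)
    pos = x.shape[-1]
    for _ in range(levels):
        approx, d = haar_step(approx)
        out[..., pos - d.shape[-1]:pos] = d
        pos -= d.shape[-1]
    out[..., :pos] = approx
    return out

def fully_separable_wt(img, levels):
    # Fully separable 2D transform: the full 1D transform is applied
    # along rows first, then along columns, so row and column scales
    # mix freely and anisotropic sub-bands appear.
    rows = haar_1d(img, levels)
    return haar_1d(rows.T, levels).T

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
coeffs = fully_separable_wt(img, 3)
# Orthonormal transform: same shape, energy (Frobenius norm) preserved.
print(coeffs.shape, np.isclose((img ** 2).sum(), (coeffs ** 2).sum()))
```

In a detection pipeline such as the one described above, the coefficient array (or its log-scaled magnitudes) would replace the raw pixels as CNN input; the samplet variant is not sketched here.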
Datasets
CelebA, LSUN bedrooms, and Flickr-Faces-HQ (FFHQ). Fake images were generated using CramerGAN, MMDGAN, ProGAN, SN-DCGAN, StyleGAN, StyleGAN2, and StyleGAN3.
Model(s)
A lightweight multi-class CNN classifier, similar to those used in prior work.
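The paper only states that a lightweight multi-class CNN similar to prior work is used, so the layer sizes and class count below are assumptions, not the authors' architecture. A PyTorch sketch (assuming one class per GAN source plus one for real images):

```python
import torch
import torch.nn as nn

class LightweightCNN(nn.Module):
    # Hypothetical lightweight classifier: two small conv blocks followed
    # by global average pooling and a linear head. Input channels depend
    # on how the anisotropic coefficients are stacked.
    def __init__(self, in_channels=3, num_classes=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AvgPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global pooling -> size-independent
        )
        self.classifier = nn.Linear(16, num_classes)

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.classifier(h)

model = LightweightCNN()
logits = model(torch.zeros(2, 3, 64, 64))
print(logits.shape)  # one logit per source class for each image
```

Global average pooling keeps the parameter count low and lets the same head handle different input resolutions, which fits the "lightweight" description.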
Author countries
Switzerland