Enhancing Abnormality Identification: Robust Out-of-Distribution Strategies for Deepfake Detection

Authors: Luca Maiano, Fabrizio Casadei, Irene Amerini

Published: 2025-06-03 13:24:33+00:00

AI Summary

This paper proposes two novel Out-Of-Distribution (OOD) detection strategies to enhance robust deepfake detection, addressing the challenge of generalizing to open-set scenarios with continuously evolving generative models. The first approach reconstructs the input image, while the second incorporates an attention mechanism for OOD identification. These methods achieve promising results in deepfake detection and rank among the top-performing configurations on the benchmark.

Abstract

Detecting deepfakes has become a critical challenge in Computer Vision and Artificial Intelligence. Despite significant progress in detection techniques, generalizing them to open-set scenarios continues to be a persistent difficulty. Neural networks are often trained under the closed-world assumption, but with new generative models constantly evolving, it is inevitable to encounter data generated by models that are not part of the training distribution. To address these challenges, in this paper, we propose two novel Out-Of-Distribution (OOD) detection approaches. The first approach is trained to reconstruct the input image, while the second incorporates an attention mechanism for detecting OODs. Our experiments validate the effectiveness of the proposed approaches compared to existing state-of-the-art techniques. Our methods achieve promising results in deepfake detection and rank among the top-performing configurations on the benchmark, demonstrating their potential for robust, adaptable solutions in dynamic, real-world applications.


Key findings
Both CNN-based and Transformer-based approaches demonstrated promising performance, with the Transformer-based solution (Tiny DeiT) being computationally efficient and achieving faster convergence. It showed better OOD detection in the 'Content scenario' and when trained solely on synthetic OOD data compared to the CNN approach. The proposed Abnormality modules, particularly V2 and V3, proved highly effective, yielding results comparable to state-of-the-art on the CIFAR10 OOD benchmark when real OOD outliers were used.
Approach
The authors propose a two-module pipeline consisting of an In-Distribution (ID) module for deepfake classification and an Abnormality module for OOD detection. The ID module has CNN-based (U-Net Scorer) and Transformer-based (DeiT) variants; the U-Net reconstructs the input image directly, while the DeiT approach reconstructs attention maps using an Autoencoder/Variational Autoencoder. The Abnormality module then combines the ID module's Softmax probabilities, latent encoding, and reconstruction residual (from either image or attention map) to classify samples as ID or OOD, utilizing three architectural variants and different training strategies for OOD data.
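The feature fusion described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature layout (softmax probabilities, latent encoding, summary statistics of the reconstruction residual) and the untrained two-layer MLP are assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def abnormality_features(softmax_probs, latent, recon_input, recon_output):
    """Fuse the ID module's outputs into one feature vector (hypothetical layout)."""
    residual = np.abs(recon_input - recon_output).reshape(recon_input.shape[0], -1)
    # Summarize the per-sample residual map by its mean and max.
    residual_stats = np.stack([residual.mean(axis=1), residual.max(axis=1)], axis=1)
    return np.concatenate([softmax_probs, latent, residual_stats], axis=1)

class TinyMLP:
    """Minimal untrained 2-layer MLP scoring each sample as ID (low) vs OOD (high)."""
    def __init__(self, in_dim, hidden=16, seed=0):
        r = np.random.default_rng(seed)
        self.W1 = r.normal(scale=0.1, size=(in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = r.normal(scale=0.1, size=(hidden, 1))
        self.b2 = np.zeros(1)

    def forward(self, x):
        h = np.maximum(x @ self.W1 + self.b1, 0.0)   # ReLU hidden layer
        logits = h @ self.W2 + self.b2
        return 1.0 / (1.0 + np.exp(-logits))         # sigmoid OOD score in (0, 1)

# Toy batch: 4 samples, 2-class softmax, 8-dim latent, 16x16 reconstructions.
probs = rng.dirichlet([1.0, 1.0], size=4)
latent = rng.normal(size=(4, 8))
imgs = rng.normal(size=(4, 16, 16))
recons = imgs + rng.normal(scale=0.05, size=imgs.shape)

feats = abnormality_features(probs, latent, imgs, recons)   # shape (4, 12)
scores = TinyMLP(feats.shape[1]).forward(feats)             # shape (4, 1)
```

In the paper's pipeline the residual would come from either the U-Net image reconstruction or the DeiT attention-map reconstruction, and the MLP would be trained on ID samples plus real or synthetic OOD outliers, per the training strategies mentioned above.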
Datasets
CDDB (Continual Deepfake Detection Benchmark) Dataset, CIFAR10 OOD benchmark
Model(s)
U-Net Scorer, Data-efficient Image Transformer (DeiT), Autoencoder (AE), Variational Autoencoder (VAE), Multilayer Perceptron (MLP)
Author countries
Italy