Towards Sustainable Universal Deepfake Detection with Frequency-Domain Masking

Authors: Chandler Timm C. Doloriel, Habib Ullah, Kristian Hovde Liland, Fadi Al Machot, Ngai-Man Cheung

Published: 2025-12-08 21:08:25+00:00

AI Summary

This research introduces frequency-domain masking as a training strategy for universal deepfake detection, aiming to identify AI-generated images across various generative models, including unseen ones. The approach enhances detection accuracy and generalization by focusing on frequency-based features rather than spatial ones, while also maintaining performance under significant model pruning. This offers a scalable and resource-conscious solution aligned with Green AI principles, achieving state-of-the-art generalization on GAN- and diffusion-generated image datasets.

Abstract

Universal deepfake detection aims to identify AI-generated images across a broad range of generative models, including unseen ones. This requires robust generalization to new and unseen deepfakes, which emerge frequently, while minimizing computational overhead to enable large-scale deepfake screening, a critical objective in the era of Green AI. In this work, we explore frequency-domain masking as a training strategy for deepfake detectors. Unlike traditional methods that rely heavily on spatial features or large-scale pretrained models, our approach introduces random masking and geometric transformations, with a focus on frequency masking due to its superior generalization properties. We demonstrate that frequency masking not only enhances detection accuracy across diverse generators but also maintains performance under significant model pruning, offering a scalable and resource-conscious solution. Our method achieves state-of-the-art generalization on GAN- and diffusion-generated image datasets and exhibits consistent robustness under structured pruning. These results highlight the potential of frequency-based masking as a practical step toward sustainable and generalizable deepfake detection. Code and models are available at: [https://github.com/chandlerbing65nm/FakeImageDetection](https://github.com/chandlerbing65nm/FakeImageDetection).


Key findings
Frequency-domain masking significantly improves deepfake detection accuracy and generalization across diverse GAN- and diffusion-generated image datasets, outperforming spatial and geometric augmentations, with an optimal masking ratio of 15%. This training strategy also demonstrates robustness under structured model pruning, preserving performance even with reduced parameters, thus aligning with Green AI objectives. The approach shows strong real-world performance in specialized domains like aquaculture, effectively detecting low-quality synthetic fish images.
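The summary does not say which structured pruning criterion was used. As a minimal sketch of the general idea, the snippet below removes whole convolutional filters ranked by L1 norm. The L1 criterion, the function name `prune_filters`, and the weight layout `(out_channels, in_channels, kH, kW)` are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def prune_filters(weights, ratio):
    """Drop the `ratio` fraction of output filters with the smallest L1 norm.

    Hypothetical helper: the L1-norm ranking is an assumed criterion.
    weights: array of shape (out_channels, in_channels, kH, kW).
    Returns the pruned weights and the sorted indices of the kept filters.
    """
    weights = np.asarray(weights)
    n_out = weights.shape[0]
    # Always keep at least one filter so the layer stays usable.
    n_keep = max(1, n_out - int(round(ratio * n_out)))
    # L1 norm of each output filter, taken over all its weights.
    l1 = np.abs(weights).reshape(n_out, -1).sum(axis=1)
    # Indices of the n_keep largest-norm filters, restored to original order.
    keep = np.sort(np.argsort(l1)[::-1][:n_keep])
    return weights[keep], keep
```

Because entire filters are removed, the layer becomes genuinely smaller (fewer output channels) rather than merely sparse, which is what makes structured pruning relevant to compute cost.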
Approach
The approach applies random masking and geometric transformations during supervised training, with a primary focus on frequency-domain masking. Images are transformed into the frequency domain via the Fast Fourier Transform (FFT), selected frequency bands are randomly masked, and the result is inverse-transformed back to the spatial domain to train the detector. This compels the model to learn robust, generalizable features rather than rely on generator-specific artifacts. Masking is applied only during training, not at test time.
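The FFT, mask, inverse-FFT pipeline can be sketched in NumPy as follows. This is an illustrative approximation, not the authors' implementation: the function name `frequency_mask` is hypothetical, and the sketch zeroes individual coefficients uniformly at random across the whole spectrum, whereas the paper's masking may be restricted to particular frequency bands. The default ratio of 0.15 mirrors the 15% value reported in the key findings.

```python
import numpy as np

def frequency_mask(image, ratio=0.15, rng=None):
    """Zero a random fraction of FFT coefficients and return a spatial image.

    Hypothetical helper for illustration. `image` is (H, W) or (H, W, C).
    Assumption: coefficients are masked uniformly at random over the
    full spectrum, with the same mask shared by all channels.
    """
    rng = np.random.default_rng() if rng is None else rng
    image = np.asarray(image, dtype=np.float64)
    # 2D FFT over the spatial axes only; a trailing channel axis is untouched.
    spectrum = np.fft.fft2(image, axes=(0, 1))
    h, w = image.shape[:2]
    # True where a coefficient is kept, False where it is masked out.
    keep = rng.random((h, w)) >= ratio
    if image.ndim == 3:
        keep = keep[:, :, None]
    masked = spectrum * keep
    # Random masking breaks conjugate symmetry, so the inverse transform
    # is complex in general; only the real part is kept as the image.
    return np.real(np.fft.ifft2(masked, axes=(0, 1)))
```

In a training loop this transform would be applied to each input image before it reaches the detector and skipped entirely at evaluation time.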
Datasets
ProGAN (for training/validation), Wang et al. [38] benchmarks (including ProGAN, StyleGAN, BigGAN, GauGAN, CycleGAN, StarGAN, CRN, IMLE, SITD, SAN, DeepFake images), Ojha et al. [30] diffusion model datasets (including Guided Diffusion, Latent Diffusion, Glide, DALL-E-mini), and FakeFish (synthetic fish images generated using ControlNet and Stable Diffusion) from Das et al. [11].
Model(s)
ResNet50, Xception, CLIP-RN50, S-ResNet50, VGG11, MobileNetV2. The frequency masking technique was applied to existing state-of-the-art methods by Wang et al. [38] and Gragnaniello et al. [15], which use variants of these backbone architectures (e.g., ResNet50-0.5, ResNet50-nd).
Author countries
Norway, Singapore