Do DeepFake Attribution Models Generalize?

Authors: Spiros Baxavanakis, Manos Schinas, Symeon Papadopoulos

Published: 2025-05-22 13:49:05+00:00

AI Summary

This research investigates how well DeepFake attribution models generalize, comparing them to binary detection models across six datasets. The study finds that binary models generalize better across datasets, but that larger backbones, contrastive learning methods, and higher data quality improve attribution performance.

Abstract

Recent advancements in DeepFake generation, along with the proliferation of open-source tools, have significantly lowered the barrier for creating synthetic media. This trend poses a serious threat to the integrity and authenticity of online information, undermining public trust in institutions and media. State-of-the-art research on DeepFake detection has primarily focused on binary detection models. A key limitation of these models is that they treat all manipulation techniques as equivalent, despite the fact that different methods introduce distinct artifacts and visual cues. Only a limited number of studies explore DeepFake attribution models, although such models are crucial in practical settings. By providing the specific manipulation method employed, these models could enhance both the perceived trustworthiness and explainability for end users. In this work, we leverage five state-of-the-art backbone models and conduct extensive experiments across six DeepFake datasets. First, we compare binary and multi-class models in terms of cross-dataset generalization. Second, we examine the accuracy of attribution models in detecting seen manipulation methods in unknown datasets, hence uncovering data distribution shifts on the same DeepFake manipulations. Last, we assess the effectiveness of contrastive methods in improving cross-dataset generalization performance. Our findings indicate that while binary models demonstrate better generalization abilities, larger models, contrastive methods, and higher data quality can lead to performance improvements in attribution models. The code of this work is available on GitHub.


Key findings
Binary DeepFake detection models showed better cross-dataset generalization than multi-class attribution models. Attribution models struggled to maintain accuracy on seen manipulations from unseen datasets, highlighting distribution shifts. Contrastive learning methods offered limited performance gains for smaller networks but significantly improved generalization for larger models.
Approach
The researchers trained binary and multi-class (attribution) models on several DeepFake datasets using five state-of-the-art backbone architectures. They compared cross-dataset generalization, assessed the accuracy of attributing seen manipulation methods in unseen datasets, and evaluated the impact of contrastive learning methods.
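The summary does not specify which contrastive objective was evaluated; as an illustration only, the sketch below implements supervised contrastive (SupCon-style) loss in NumPy, a common choice for pulling embeddings of the same manipulation method together while pushing different methods apart. The function name and temperature value are assumptions, not taken from the paper.

```python
import numpy as np

def supcon_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss over a batch of embeddings.

    features: (N, D) array of raw embeddings (L2-normalized internally).
    labels:   (N,) array of class ids (e.g. manipulation-method labels).
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T / temperature                      # pairwise cosine logits
    n = len(labels)
    logits_mask = ~np.eye(n, dtype=bool)             # exclude self-similarity
    pos_mask = (labels[:, None] == labels[None, :]) & logits_mask

    # numerically stable log-softmax over all non-self pairs
    sim_max = sim.max(axis=1, keepdims=True)
    exp_sim = np.exp(sim - sim_max) * logits_mask
    log_prob = sim - sim_max - np.log(exp_sim.sum(axis=1, keepdims=True))

    # average log-probability over positives, for anchors with >= 1 positive
    pos_counts = pos_mask.sum(axis=1)
    valid = pos_counts > 0
    mean_log_prob_pos = (log_prob * pos_mask).sum(axis=1)[valid] / pos_counts[valid]
    return -mean_log_prob_pos.mean()
```

In this formulation, batches where same-method embeddings cluster tightly yield a lower loss than randomly scattered embeddings, which is the property the attribution experiments rely on.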
Datasets
FaceForensics++, CelebDF-V2, FakeAVCeleb, DFDC, ForgeryNet, DFPlatter
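The cross-dataset protocol described in the abstract (train on one dataset, evaluate on the others) can be sketched as a simple leave-one-dataset-out loop. The `train_fn` and `eval_fn` callables here are placeholders for real training and scoring code, not part of the paper's released implementation.

```python
# Hypothetical sketch of the cross-dataset generalization protocol:
# train on one dataset, then score the model on every other dataset.

DATASETS = ["FaceForensics++", "CelebDF-V2", "FakeAVCeleb",
            "DFDC", "ForgeryNet", "DFPlatter"]

def cross_dataset_eval(datasets, train_fn, eval_fn):
    """Return {(train_ds, test_ds): score} for all cross-dataset pairs."""
    results = {}
    for train_ds in datasets:
        model = train_fn(train_ds)           # fit a detector/attributor
        for test_ds in datasets:
            if test_ds == train_ds:
                continue                     # skip in-distribution evaluation
            results[(train_ds, test_ds)] = eval_fn(model, test_ds)
    return results
```

With six datasets this yields 30 train/test pairs, each measuring how well artifacts learned on one data distribution transfer to another.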
Model(s)
EfficientNetV2, ConvNextV2, PyramidNetV2, SwinV2, Efficient ViT
Author countries
Greece