FAME: A Lightweight Spatio-Temporal Network for Model Attribution of Face-Swap Deepfakes
Authors: Wasim Ahmad, Yan-Tsung Peng, Yuan-Hao Chang
Published: 2025-06-13 05:47:09+00:00
AI Summary
FAME is a lightweight spatio-temporal network for model attribution of face-swap Deepfakes, the task of determining which generative model created a given Deepfake. It integrates spatial and temporal attention mechanisms to improve attribution accuracy while maintaining computational efficiency, and it outperforms existing methods in both accuracy and runtime on three datasets: DFDM, FaceForensics++, and FakeAVCeleb.
Abstract
The widespread emergence of face-swap Deepfake videos poses growing risks to digital security, privacy, and media integrity, necessitating effective forensic tools for identifying the source of such manipulations. Although most prior research has focused on binary Deepfake detection, the task of model attribution, which determines which generative model produced a given Deepfake, remains underexplored. In this paper, we introduce FAME (Fake Attribution via Multilevel Embeddings), a lightweight and efficient spatio-temporal framework designed to capture subtle generative artifacts specific to different face-swap models. FAME integrates spatial and temporal attention mechanisms to improve attribution accuracy while remaining computationally efficient. We evaluate our model on three challenging and diverse datasets: DeepFakes from Different Models (DFDM), FaceForensics++, and FakeAVCeleb. Results show that FAME consistently outperforms existing methods in both accuracy and runtime, highlighting its potential for deployment in real-world forensic and information security applications.
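The abstract does not specify how FAME's attention modules are built, so the following is only a generic sketch of what combining spatial and temporal attention can look like, not the authors' architecture. It assumes a clip is already represented as per-frame lists of patch feature vectors, and it uses plain softmax attention pooling with fixed query vectors. The names `attention_pool`, `clip_embedding`, `spatial_query`, and `temporal_query` are hypothetical and chosen for illustration.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a 1-D sequence of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(items: Sequence[Vector], query: Vector) -> Vector:
    """Weight each item by softmax(item . query) and return the weighted sum."""
    scores = [sum(x * q for x, q in zip(item, query)) for item in items]
    weights = softmax(scores)
    dim = len(items[0])
    return [sum(w * item[d] for w, item in zip(weights, items)) for d in range(dim)]


def clip_embedding(
    clip: Sequence[Sequence[Vector]],
    spatial_query: Vector,
    temporal_query: Vector,
) -> Vector:
    """Illustrative two-stage pooling (an assumption, not FAME itself).

    1. Spatial attention: pool the patch features of each frame.
    2. Temporal attention: pool the resulting frame embeddings.
    """
    frame_embeddings = [attention_pool(patches, spatial_query) for patches in clip]
    return attention_pool(frame_embeddings, temporal_query)


# Toy clip: 2 frames, 2 patches per frame, 2-dimensional features.
toy_clip = [
    [[1.0, 0.0], [3.0, 0.0]],
    [[0.0, 2.0], [0.0, 4.0]],
]

# Zero queries give uniform attention, so pooling reduces to averaging.
uniform = clip_embedding(toy_clip, [0.0, 0.0], [0.0, 0.0])
print(uniform)
```

In a trained attribution model the queries would be learned parameters and the pooled clip embedding would feed a classifier over candidate generative models. Here the queries are fixed by hand only to keep the example self-contained.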