DiffusionFF: A Diffusion-based Framework for Joint Face Forgery Detection and Fine-Grained Artifact Localization
Authors: Siran Peng, Haoyuan Zhang, Li Gao, Tianshuo Zhang, Xiangyu Zhu, Bao Li, Weisong Zhao, Zhen Lei
Published: 2025-08-03 18:06:04+00:00
AI Summary
DiffusionFF is a diffusion-based framework for joint face forgery detection and fine-grained artifact localization. It uses a pretrained forgery detector as an artifact encoder and repurposes a denoising diffusion model as an artifact decoder, conditioned on multi-scale forgery-related features from the encoder. The decoder progressively synthesizes an artifact localization map, which is then fused with high-level semantic features from the forgery detector to improve detection performance.
Abstract
The rapid evolution of deepfake technologies demands robust and reliable face forgery detection algorithms. While determining whether an image has been manipulated remains essential, the ability to precisely localize forgery clues is also important for enhancing model explainability and building user trust. To address this dual challenge, we introduce DiffusionFF, a diffusion-based framework that simultaneously performs face forgery detection and fine-grained artifact localization. Our key idea is to establish a novel encoder-decoder architecture: a pretrained forgery detector serves as a powerful artifact encoder, and a denoising diffusion model is repurposed as an artifact decoder. Conditioned on multi-scale forgery-related features extracted by the encoder, the decoder progressively synthesizes a detailed artifact localization map. We then fuse this fine-grained localization map with high-level semantic features from the forgery detector, leading to substantial improvements in detection capability. Extensive experiments show that DiffusionFF achieves state-of-the-art (SOTA) performance across multiple benchmarks, underscoring its superior effectiveness and explainability.
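The data flow described in the abstract (encode, progressively decode a localization map, then fuse it with semantic features) can be illustrated with a toy sketch. This is not the authors' implementation: every function name, weight, and update rule below is a hypothetical stand-in chosen for illustration. In particular, the "encoder" is simple average pooling rather than a pretrained detector, and the "decoder" is a deterministic refinement loop that only mimics the iterative shape of diffusion sampling, without any learned noise prediction.

```python
import math
import random


def encode(image):
    """Toy stand-in for the artifact encoder (hypothetical, not a real detector).

    Returns multi-scale feature maps and a high-level semantic vector.
    Assumes the image has even height and width.
    """
    h, w = len(image), len(image[0])
    fine = [row[:] for row in image]  # full-resolution "features"
    coarse = [  # half-resolution "features" via 2x2 average pooling
        [sum(image[2 * i + di][2 * j + dj] for di in (0, 1) for dj in (0, 1)) / 4.0
         for j in range(w // 2)]
        for i in range(h // 2)
    ]
    flat = [v for row in image for v in row]
    semantic = [sum(flat) / len(flat), max(flat)]
    return [fine, coarse], semantic


def condition(features):
    """Merge the two scales into one per-pixel conditioning signal."""
    fine, coarse = features
    return [
        [0.5 * fine[i][j] + 0.5 * coarse[i // 2][j // 2] for j in range(len(fine[0]))]
        for i in range(len(fine))
    ]


def decode(features, steps=10, seed=0):
    """Toy stand-in for the diffusion decoder.

    Starts from Gaussian noise and refines it step by step toward the
    conditioning signal. A real diffusion model would use a learned
    denoising network here; this update rule is only a placeholder.
    """
    rng = random.Random(seed)
    cond = condition(features)
    x = [[rng.gauss(0.0, 1.0) for _ in row] for row in cond]
    for t in range(steps, 0, -1):
        x = [
            [x[i][j] + (cond[i][j] - x[i][j]) / t for j in range(len(x[0]))]
            for i in range(len(x))
        ]
    return x


def detect(image, weights=(1.0, 1.0, 2.0), bias=-1.5):
    """Fuse the pooled localization map with semantic features into a score.

    The weights and bias are arbitrary illustrative values, not trained ones.
    """
    features, semantic = encode(image)
    loc_map = decode(features)
    pooled = sum(v for row in loc_map for v in row) / (len(loc_map) * len(loc_map[0]))
    fused = semantic + [pooled]
    logit = sum(w * f for w, f in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit)), loc_map


# A 4x4 toy "image" with a bright patch standing in for a forged region.
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
score, loc_map = detect(image)
```

The point of the sketch is the ordering of the stages: the localization map is produced first from multi-scale features, and the final detection score depends on both that map and the encoder's semantic vector, so a single forward pass yields both the score and the map.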