A Quality-Centric Framework for Generic Deepfake Detection

Authors: Wentang Song, Zhiyuan Yan, Yuzhen Lin, Taiping Yao, Changsheng Chen, Shen Chen, Yandan Zhao, Shouhong Ding, Bin Li

Published: 2024-11-08 05:14:46+00:00

AI Summary

This paper introduces a quality-centric framework for deepfake detection that improves generalization by accounting for the forgery quality of the training data. The framework combines a quality evaluator, a low-quality data enhancement module (FreDA), and a learning pacing strategy that guides the model from easy to hard examples.

Abstract

Detecting AI-generated images, particularly deepfakes, has become increasingly crucial, with the primary challenge being the generalization to previously unseen manipulation methods. This paper tackles this issue by leveraging the forgery quality of training data to improve the generalization performance of existing deepfake detectors. Generally, the forgery quality of different deepfakes varies: some have easily recognizable forgery clues, while others are highly realistic. Existing works often train detectors on a mix of deepfakes with varying forgery qualities, potentially leading detectors to short-cut the easy-to-spot artifacts from low-quality forgery samples, thereby hurting generalization performance. To tackle this issue, we propose a novel quality-centric framework for generic deepfake detection, which is composed of a Quality Evaluator, a low-quality data enhancement module, and a learning pacing strategy that explicitly incorporates forgery quality into the training process. Our framework is inspired by curriculum learning, which is designed to gradually enable the detector to learn more challenging deepfake samples, starting with easier samples and progressing to more realistic ones. We employ both static and dynamic assessments to assess the forgery quality, combining their scores to produce a final rating for each training sample. The rating score guides the selection of deepfake samples for training, with higher-rated samples having a higher probability of being chosen. Furthermore, we propose a novel frequency data augmentation method specifically designed for low-quality forgery samples, which helps to reduce obvious forgery traces and improve their overall realism. Extensive experiments demonstrate that our proposed framework can be applied plug-and-play to existing detection models and significantly enhance their generalization performance in detection.
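The abstract says that static and dynamic scores are combined into one rating per sample, and that higher-rated samples are more likely to be selected. The sketch below illustrates that idea only; it is not the authors' implementation. The min-max normalization, the equal weighting (`weight=0.5`), the proportional sampling rule, and every function name are assumptions made for illustration.

```python
import random

def min_max(values):
    """Rescale a list of scores to [0, 1]; constant lists map to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def combine_scores(static_scores, dynamic_scores, weight=0.5):
    """Blend normalized static and dynamic scores (assumed: weighted average)."""
    s = min_max(static_scores)
    d = min_max(dynamic_scores)
    return [weight * a + (1.0 - weight) * b for a, b in zip(s, d)]

def sampling_probabilities(ratings, eps=1e-8):
    """Turn ratings into probabilities proportional to the rating.

    eps keeps every sample reachable and avoids division by zero.
    """
    shifted = [r + eps for r in ratings]
    total = sum(shifted)
    return [r / total for r in shifted]

def sample_indices(ratings, n, seed=0):
    """Draw n sample indices with replacement, favoring higher ratings."""
    rng = random.Random(seed)
    probs = sampling_probabilities(ratings)
    return rng.choices(range(len(probs)), weights=probs, k=n)

# Hypothetical scores for three forged samples.
ratings = combine_scores([0.1, 0.5, 0.9], [0.2, 0.4, 1.0])
batch = sample_indices(ratings, n=10)
```

Sampling with replacement keeps the sketch short; a real training loop would more likely pass such weights to its data loader's sampler.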

Key findings

According to the paper, the framework significantly improves the generalization performance of deepfake detectors across various datasets and outperforms the state-of-the-art methods it is compared against. The ablation studies indicate that each component contributes, in particular FreDA, which enhances low-quality samples, and the learning pacing strategy, which guides the training process. The method also improves results in cross-manipulation scenarios.

Approach

The authors propose a three-module framework: a Quality Evaluator assigning a Forgery Quality Score (FQS) based on static and dynamic assessments; FreDA, a frequency-domain data augmentation technique for low-quality samples; and a learning pacing strategy prioritizing high-FQS samples during training, gradually shifting focus to harder examples.
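A learning pacing strategy of this kind is often written as a pacing function that controls how much of the difficulty-sorted training set is visible at each epoch. The sketch below shows a generic linear pacing schedule from the curriculum learning literature, not the paper's own schedule. The function names, the `start_frac` parameter, and the `difficulty` values are hypothetical; how FQS maps to difficulty is left to the caller.

```python
def pacing_fraction(epoch, total_epochs, start_frac=0.3):
    """Fraction of the sorted dataset exposed at a given epoch.

    Grows linearly from start_frac at epoch 0 to 1.0 at the last epoch.
    """
    if total_epochs <= 1:
        return 1.0
    progress = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return start_frac + (1.0 - start_frac) * progress

def curriculum_subset(difficulty, epoch, total_epochs, start_frac=0.3):
    """Return indices of the easiest samples allowed at this epoch.

    difficulty: one number per sample, lower means easier (assumed convention).
    """
    order = sorted(range(len(difficulty)), key=lambda i: difficulty[i])
    frac = pacing_fraction(epoch, total_epochs, start_frac)
    k = max(1, round(frac * len(order)))
    return order[:k]

# Hypothetical difficulty values for four samples over a 5-epoch run.
difficulty = [0.9, 0.1, 0.5, 0.3]
first_epoch = curriculum_subset(difficulty, epoch=0, total_epochs=5)
last_epoch = curriculum_subset(difficulty, epoch=4, total_epochs=5)
```

With these values the first epoch sees only the easiest sample and the last epoch sees all four, ordered from easiest to hardest.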

Datasets

FaceForensics++ (FF++), Celeb-DFv2 (CDF), Deepfake Detection Challenge Preview (DFDC-p), Deepfake Detection Challenge Public Test Set (DFDC), WildDeepfake (Wild), DF40

Model(s)

Swin-Transformer-V2 Base (Swin-V2), ArcFace (for quality evaluation)

Author countries

China