X2-DFD: A framework for eXplainable and eXtendable Deepfake Detection
Authors: Yize Chen, Zhiyuan Yan, Guangliang Cheng, Kangran Zhao, Siwei Lyu, Baoyuan Wu
Published: 2024-10-08 15:28:33+00:00
AI Summary
X2-DFD is a novel framework for deepfake detection that improves both detection accuracy and explainability by enhancing the strengths and supplementing the weaknesses of multimodal large language models (MLLMs). It achieves this through a three-stage process: Model Feature Assessment, Explainable Dataset Construction, and Fine-tuning and Inference.
Abstract
This paper proposes X2-DFD, an eXplainable and eXtendable framework based on multimodal large-language models (MLLMs) for deepfake detection, consisting of three key stages. The first stage, Model Feature Assessment, systematically evaluates the detectability of forgery-related features for the MLLM, generating a prioritized ranking of features based on their intrinsic importance to the model. The second stage, Explainable Dataset Construction, consists of two key modules: Strong Feature Strengthening, which is designed to enhance the model's existing detection and explanation capabilities by reinforcing its well-learned features, and Weak Feature Supplementing, which addresses gaps by integrating specific feature detectors (e.g., low-level artifact analyzers) to compensate for the MLLM's limitations. The third stage, Fine-tuning and Inference, involves fine-tuning the MLLM on the constructed dataset and deploying it for final detection and explanation. By integrating these three stages, our approach enhances the MLLM's strengths while supplementing its weaknesses, ultimately improving both the detectability and explainability. Extensive experiments and ablations, followed by a comprehensive human study, validate the improved performance of our approach compared to the original MLLMs. More encouragingly, our framework is designed to be plug-and-play, allowing it to seamlessly integrate with future more advanced MLLMs and specific feature detectors, leading to continual improvement and extension to face the challenges of rapidly evolving deepfakes.