Text Modality Oriented Image Feature Extraction for Detecting Diffusion-based DeepFake

Authors: Di Yang, Yihao Huang, Qing Guo, Felix Juefei-Xu, Xiaojun Jia, Run Wang, Geguang Pu, Yang Liu

Published: 2024-05-28 11:29:30+00:00

AI Summary

This paper proposes TOFE, a text modality-oriented feature extraction method for detecting diffusion-based deepfakes. TOFE iteratively refines a text embedding so that it can guide a text-to-image model to generate the target image; the resulting embedding captures both the low-level and high-level image features crucial for distinguishing real from fake images.

Abstract

The widespread use of diffusion methods enables the creation of highly realistic images on demand, thereby posing significant risks to the integrity and safety of online information and highlighting the necessity of DeepFake detection. Our analysis of features extracted by traditional image encoders reveals that both low-level and high-level features offer distinct advantages in identifying DeepFake images produced by various diffusion methods. Inspired by this finding, we aim to develop an effective representation that captures both low-level and high-level features to detect diffusion-based DeepFakes. To address the problem, we propose a text modality-oriented feature extraction method, termed TOFE. Specifically, for a given target image, the representation we discovered is a corresponding text embedding that can guide the generation of the target image with a specific text-to-image model. Experiments conducted across ten diffusion types demonstrate the efficacy of our proposed method.


Key findings
TOFE outperforms existing state-of-the-art deepfake detection methods across multiple diffusion models, in both in-domain and out-of-domain settings. It also performs better than detectors built on only low-level features, only high-level features, or a simple concatenation of the two. Ablation studies support the chosen hyperparameters.
Approach
TOFE refines a text embedding so that it guides a specific text-to-image model to reproduce the target image. The refined embedding, which captures both low-level and high-level image features, is then fed to a classifier for deepfake detection. This approach addresses the limitations of relying solely on low-level or high-level image features.
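The refinement step can be sketched as an optimization over the text embedding: the text-to-image model stays frozen while the embedding is updated to minimize the reconstruction error against the target image. The sketch below is a minimal illustration, not the paper's implementation; a small frozen linear layer stands in for the actual diffusion model (the paper uses Stable Diffusion v2.0), and the embedding/image sizes and optimizer settings are arbitrary choices for demonstration.

```python
import torch

torch.manual_seed(0)

# Toy stand-in for a frozen text-to-image generator (the paper uses
# Stable Diffusion v2.0); it maps a text embedding to a flattened "image".
generator = torch.nn.Linear(16, 64)
for p in generator.parameters():
    p.requires_grad_(False)

target_image = torch.randn(64)                    # image whose embedding we seek
embedding = torch.zeros(16, requires_grad=True)   # text embedding to refine
optimizer = torch.optim.Adam([embedding], lr=0.1)

losses = []
for step in range(200):                           # iterative refinement
    optimizer.zero_grad()
    recon = generator(embedding)
    loss = torch.nn.functional.mse_loss(recon, target_image)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# After refinement, `embedding` is the image's representation used for detection.
```

The key design point is that only the embedding receives gradients; because the generator is frozen, any information needed to reconstruct the image, from low-level texture to high-level semantics, must be absorbed into the embedding itself.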
Datasets
DIRE dataset, containing images generated by ten diffusion methods (ADM, DALLE2, DDPM, IDDPM, IF, LDM, PNDM, SD-V1, SD-V2, VQ-Diffusion), plus real images.
Model(s)
Stable Diffusion v2.0 as the text-to-image model (v1.4 in an ablation study); a simple MLP classifier performs the final real/fake classification.
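A minimal sketch of such an MLP head is shown below. The input width of 768 is an assumption (a common text-embedding dimension in CLIP-style encoders); the hidden width, depth, and the real/fake label convention are illustrative choices, not the paper's architecture.

```python
import torch

# Hypothetical MLP head classifying a refined text embedding as real vs. fake.
# Input size 768 and hidden size 256 are assumptions for illustration.
classifier = torch.nn.Sequential(
    torch.nn.Linear(768, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 2),   # logits for {real, fake}
)

embeddings = torch.randn(4, 768)   # a batch of refined text embeddings
logits = classifier(embeddings)
preds = logits.argmax(dim=1)       # 0 = real, 1 = fake (assumed convention)
```

In practice such a head would be trained with a standard cross-entropy loss on embeddings extracted from labeled real and generated images.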
Author countries
China, Singapore, USA