Recent Advances on Generalizable Diffusion-generated Image Detection

Authors: Qijie Xu, Defang Chen, Jiawei Chen, Siwei Lyu, Can Wang

Published: 2025-02-27 03:14:40+00:00

AI Summary

This research paper provides a systematic survey of recent advances in generalizable diffusion-generated image detection. It categorizes existing methods into data-driven and feature-driven approaches, offering a comprehensive taxonomy and identifying key open challenges for future research.

Abstract

The rise of diffusion models has significantly improved the fidelity and diversity of generated images. Alongside these benefits, such advances also introduce new risks: diffusion models can be exploited to create high-quality Deepfake images, which poses challenges for image authenticity verification. In recent years, research on generalizable diffusion-generated image detection has grown rapidly, but a comprehensive review of this topic is still lacking. To bridge this gap, we present a systematic survey of recent advances and classify them into two main categories: (1) data-driven detection and (2) feature-driven detection. Existing detection methods are further classified into six fine-grained categories based on their underlying principles. Finally, we identify several open challenges and envision some future directions, with the hope of inspiring more research on this important topic. Works reviewed in this survey can be found at https://github.com/zju-pi/Awesome-Diffusion-generated-Image-Detection.


Key findings
The survey reveals a rapid increase in research on diffusion-generated image detection. It highlights the need for improved robustness to post-processing, stronger theoretical foundations, and higher-quality datasets. The paper also proposes an alternative paradigm for generalizable detection using specialized models for different generative model categories.
Approach
The paper surveys existing methods for detecting diffusion-generated images, classifying them into two main categories: data-driven detection, which refines training strategies to improve generalization, and feature-driven detection, which analyzes differences between real and generated images in specific feature spaces. These two categories are further divided into six fine-grained sub-categories based on their underlying principles.
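To make the feature-driven idea concrete, the following is a minimal illustrative sketch, not any specific method from the survey: it scores an image by the energy of its high-frequency residual, under the hypothetical assumption that generated images differ from real ones in their high-frequency statistics. The function names, the 4-neighbour high-pass filter, and the threshold rule are all assumptions for illustration.

```python
def highpass_residual(img):
    """Residual of a 2D grayscale image (list of lists of floats)
    after subtracting each interior pixel's 4-neighbour mean."""
    h, w = len(img), len(img[0])
    res = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            mean = (img[y - 1][x] + img[y + 1][x]
                    + img[y][x - 1] + img[y][x + 1]) / 4.0
            res[y][x] = img[y][x] - mean
    return res

def residual_energy(img):
    """Mean squared high-frequency residual over interior pixels;
    this scalar serves as a toy 'feature' for detection."""
    res = highpass_residual(img)
    h, w = len(res), len(res[0])
    n = max((h - 2) * (w - 2), 1)
    return sum(v * v for row in res for v in row) / n

def looks_generated(img, threshold):
    # Hypothetical decision rule: flag images whose residual energy is
    # below a calibrated threshold (i.e., suspiciously smooth output).
    return residual_energy(img) < threshold
```

In practice, surveyed feature-driven methods operate in far richer feature spaces (e.g., learned or frequency-domain representations) and learn the decision boundary from data rather than using a hand-set threshold; this sketch only illustrates the overall pattern of extracting a discriminative statistic and thresholding it.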
Datasets
GenImage and DiffusionForensics; other datasets are mentioned throughout the paper but not explicitly listed here.
Model(s)
Various models and architectures are discussed, including Vision Transformer (ViT), CLIP-ViT, and Stable Diffusion, but as a survey the paper does not contribute a single model of its own.
Author countries
China, USA