When Deepfakes Look Real: Detecting AI-Generated Faces with Unlabeled Data due to Annotation Challenges
Authors: Zhiqiang Yang, Renshuai Tao, Xiaolong Zheng, Guodong Yang, Chunjie Zhang
Published: 2025-08-12 15:37:17+00:00
AI Summary
This paper introduces DPGNet, a novel deepfake detection framework that effectively utilizes unlabeled data to overcome annotation challenges. DPGNet addresses the domain gap between different deepfake generation models and leverages unlabeled images through text-guided cross-domain alignment and curriculum-driven pseudo label generation, outperforming state-of-the-art approaches by 6.3%.
Abstract
Existing deepfake detection methods depend heavily on labeled training data. However, as AI-generated content becomes increasingly realistic, even human annotators struggle to distinguish between deepfakes and authentic images. This makes the labeling process both time-consuming and less reliable. As a result, there is a growing demand for approaches that can effectively utilize large-scale unlabeled data from online social networks. Unlike typical unsupervised learning tasks, where categories are distinct, AI-generated faces closely mimic real image distributions and share strong similarities with them, which degrades the performance of conventional strategies. In this paper, we introduce the Dual-Path Guidance Network (DPGNet) to tackle two key challenges: (1) bridging the domain gap between faces from different generation models, and (2) utilizing unlabeled image samples. The method features two core modules: text-guided cross-domain alignment, which uses learnable prompts to unify visual and textual embeddings into a domain-invariant feature space, and curriculum-driven pseudo label generation, which dynamically exploits more informative unlabeled samples. To prevent catastrophic forgetting, we also facilitate bridging between domains via cross-domain knowledge distillation. Extensive experiments on 11 popular datasets show that DPGNet outperforms SoTA approaches by 6.3%, highlighting its effectiveness in leveraging unlabeled data to address the annotation challenges posed by the increasing realism of deepfakes.
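The abstract does not say how the curriculum for pseudo label generation is scheduled. As a rough illustration of the general idea only, the sketch below assumes a confidence threshold that is relaxed linearly over training, so that only the most confident unlabeled samples receive pseudo labels early on and more samples are admitted later. The function names, the linear schedule, and the threshold values are all hypothetical and are not taken from DPGNet.

```python
def curriculum_threshold(epoch, total_epochs, start=0.95, end=0.70):
    """Linearly relax the confidence threshold from `start` to `end`.

    Assumption: a linear schedule; the paper's actual schedule is not
    described in the abstract.
    """
    if total_epochs <= 1:
        return end
    # Fraction of training completed, clamped to [0, 1].
    frac = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return start + (end - start) * frac


def select_pseudo_labels(fake_probs, threshold):
    """Return (index, label) pairs for sufficiently confident samples.

    `fake_probs` holds the predicted probability that each unlabeled
    image is fake. Label 1 means fake, label 0 means real.
    """
    selected = []
    for i, p in enumerate(fake_probs):
        confidence = max(p, 1.0 - p)  # confidence in the predicted class
        if confidence >= threshold:
            selected.append((i, 1 if p >= 0.5 else 0))
    return selected


# Hypothetical predictions for five unlabeled images.
probs = [0.98, 0.60, 0.03, 0.80, 0.25]

# Early in training the threshold is strict, so few samples qualify.
early = select_pseudo_labels(probs, curriculum_threshold(0, 10))

# Late in training the threshold is looser, so more samples qualify.
late = select_pseudo_labels(probs, curriculum_threshold(9, 10))

print(early)
print(late)
```

In this toy run the uncertain sample with probability 0.60 is never pseudo-labeled, while the moderately confident samples are only admitted once the threshold has been relaxed. In a real pipeline the selected pairs would be added to the training set for the next round.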