A Continual Deepfake Detection Benchmark: Dataset, Methods, and Essentials

Authors: Chuqiao Li, Zhiwu Huang, Danda Pani Paudel, Yabin Wang, Mohamad Shahbazi, Xiaopeng Hong, Luc Van Gool

Published: 2022-05-11 13:07:19+00:00

AI Summary

This paper introduces a continual deepfake detection benchmark (CDDB) using a new dataset of deepfakes from known and unknown generative models. The benchmark evaluates methods' ability to incrementally learn deepfake detection without catastrophic forgetting across easy, hard, and long sequences of tasks, providing insights into continual learning essentials for deepfake detection.

Abstract

A number of benchmarks and techniques have emerged for the detection of deepfakes. However, very few works study the detection of incrementally appearing deepfakes in real-world scenarios. To simulate such in-the-wild settings, this paper proposes a continual deepfake detection benchmark (CDDB) over a new collection of deepfakes from both known and unknown generative models. CDDB defines multiple evaluations of detection over easy, hard, and long sequences of deepfake tasks, with a set of appropriate measures. In addition, we explore multiple approaches to adapting multiclass incremental learning methods, commonly used in continual visual recognition, to the continual deepfake detection problem. We evaluate existing methods, including their adapted versions, on the proposed CDDB. Within the benchmark, we examine some commonly known essentials of standard continual learning. Our study provides new insights into these essentials in the context of continual deepfake detection. CDDB is clearly more challenging than existing benchmarks and thus offers a suitable evaluation avenue for future research. Both data and code are available at https://github.com/Coral79/CDDB.


Key findings
The proposed CDDB is significantly more challenging than existing benchmarks. Multi-task learning adaptations generally outperformed binary and multi-class adaptations. The memory budget was identified as a crucial factor in continual deepfake detection performance.
Approach
The authors adapt multiclass incremental learning methods to the binary deepfake detection problem. They evaluate existing methods and their adaptations on the proposed CDDB, assessing performance across easy, hard, and long sequences of deepfake tasks using average accuracy, average forgetting degree, and mean average precision.
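As a rough illustration of two of the measures named above, the sketch below computes average accuracy and average forgetting from a matrix of per-task accuracies. It assumes the definitions commonly used in the continual learning literature (mean final accuracy, and mean drop from each task's best earlier accuracy); the paper's exact formulas may differ, and the function names and numbers are hypothetical.

```python
from typing import List

# acc[i][j] = accuracy on task j, measured after training on tasks 0..i.
# Entries with j > i (tasks not yet seen) are ignored by both functions.

def average_accuracy(acc: List[List[float]]) -> float:
    """Mean accuracy over all tasks after training on the final task."""
    final_row = acc[-1]
    return sum(final_row) / len(final_row)

def average_forgetting(acc: List[List[float]]) -> float:
    """Mean drop from each earlier task's best past accuracy to its final accuracy.

    This is the common literature definition, assumed here for illustration.
    """
    num_tasks = len(acc)
    if num_tasks < 2:
        return 0.0  # nothing can be forgotten with a single task
    drops = []
    for j in range(num_tasks - 1):
        # Best accuracy on task j at any step before the final one.
        best_earlier = max(acc[i][j] for i in range(j, num_tasks - 1))
        drops.append(best_earlier - acc[-1][j])
    return sum(drops) / len(drops)

# Made-up accuracies for a three-task sequence (not results from the paper).
acc_matrix = [
    [0.90, 0.00, 0.00],
    [0.80, 0.85, 0.00],
    [0.70, 0.75, 0.95],
]

print(f"average accuracy:   {average_accuracy(acc_matrix):.3f}")
print(f"average forgetting: {average_forgetting(acc_matrix):.3f}")
```

A lower forgetting value means the detector retains more of its performance on earlier deepfake sources as new ones are learned.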
Datasets
CDDB is a new collection that combines publicly available deepfakes from known generative models (e.g., ProGAN, StyleGAN, BigGAN) and unknown ones (e.g., WildDeepfake, WhichFaceReal), together with other sources such as FaceForensics++.
Model(s)
ResNet-50 (CNNDet), ConViT, NSCIL, LRCIL, iCaRL, LUCIR, DyTox. Various adaptations (Binary-class, Multi-class, Multi-task learning) were applied to these models.
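The summary does not spell out how labels are arranged under each adaptation. The following is a minimal sketch of one plausible scheme, assuming the binary-class variant shares a single real/fake label pair across all tasks while the multi-class variant gives each task its own real and fake classes and collapses them back to real/fake at prediction time. All names and the index layout are assumptions for illustration, not the authors' implementation.

```python
REAL, FAKE = 0, 1

def binary_label(task_id: int, is_fake: bool) -> int:
    """Binary-class view: every task shares the same two labels."""
    return FAKE if is_fake else REAL

def multiclass_label(task_id: int, is_fake: bool) -> int:
    """Multi-class view: task t owns classes 2t (real) and 2t + 1 (fake)."""
    return 2 * task_id + (FAKE if is_fake else REAL)

def multiclass_to_binary(class_index: int) -> int:
    """Collapse a multi-class prediction back to a real/fake decision."""
    return class_index % 2

def multiclass_to_task(class_index: int) -> int:
    """Recover which task (deepfake source) a multi-class index belongs to."""
    return class_index // 2

# A fake image from the third task (task_id = 2) under both views.
cls = multiclass_label(task_id=2, is_fake=True)
print(cls, multiclass_to_task(cls), multiclass_to_binary(cls))
print(binary_label(task_id=2, is_fake=True))
```

Under this layout a multi-class incremental learner can be trained unchanged, since each new task simply adds two classes, while evaluation still reduces to the binary real/fake decision.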
Author countries
Switzerland, Singapore, China, Belgium