AV-Deepfake1M++: A Large-Scale Audio-Visual Deepfake Benchmark with Real-World Perturbations
Authors: Zhixi Cai, Kartik Kuckreja, Shreya Ghosh, Akanksha Chuchra, Muhammad Haris Khan, Usman Tariq, Tom Gedeon, Abhinav Dhall
Published: 2025-07-28 07:27:42+00:00
AI Summary
This paper introduces AV-Deepfake1M++, a large-scale audio-visual deepfake benchmark comprising 2 million video clips with diversified manipulation strategies and extensive audio-visual perturbations. It details the data generation strategies, which include various state-of-the-art deepfake models and real-world perturbations, and provides a benchmark using existing detection methods. The authors aim for this dataset to facilitate research in the deepfake domain and are hosting a detection challenge based on it.
Abstract
The rapid surge of text-to-speech and face-voice reenactment models makes video fabrication easy and highly realistic. To counter this problem, we require datasets rich in generation methods and in the perturbation strategies common to online videos. To this end, we propose AV-Deepfake1M++, an extension of AV-Deepfake1M comprising 2 million video clips with diversified manipulation strategies and audio-visual perturbations. This paper describes the data generation strategies and benchmarks AV-Deepfake1M++ using state-of-the-art methods. We believe this dataset will play a pivotal role in facilitating research in the deepfake domain. Based on this dataset, we host the 2025 1M-Deepfakes Detection Challenge. The challenge details, dataset, and evaluation scripts are available online under a research-only license at https://deepfakes1m.github.io/2025.