AUDDT: Audio Unified Deepfake Detection Benchmark Toolkit
Authors: Yi Zhu, Heitor R. Guimarães, Arthur Pimentel, Tiago Falk
Published: 2025-09-25 21:09:40+00:00
AI Summary
This paper introduces AUDDT, an open-source toolkit for benchmarking audio deepfake detection models across 28 diverse datasets. It aims to automate the evaluation process, providing insights into the generalization capabilities and shortcomings of pretrained detectors. The toolkit also highlights limitations of current datasets and their gap relative to real-world deployment.
Abstract
With the growing prevalence of artificial intelligence (AI)-generated content, such as audio deepfakes, a large body of recent work has focused on developing deepfake detection techniques. However, most models are evaluated on a narrow set of datasets, leaving their generalization to real-world conditions uncertain. In this paper, we systematically review 28 existing audio deepfake datasets and present an open-source benchmarking toolkit called AUDDT (https://github.com/MuSAELab/AUDDT). The goal of this toolkit is to automate the evaluation of pretrained detectors across these 28 datasets, giving users direct feedback on the strengths and shortcomings of their deepfake detectors. We start by showcasing the usage of the toolkit, the composition of our benchmark, and the breakdown of different deepfake subgroups. Next, using a widely adopted pretrained deepfake detector, we present in- and out-of-domain detection results, revealing notable differences across conditions and audio manipulation types. Lastly, we analyze the limitations of these existing datasets and their gap relative to practical deployment scenarios.
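To make the idea of evaluating one pretrained detector across several datasets concrete, the following is a minimal sketch and not the AUDDT API. Everything in it is an assumption made for illustration: the function names `equal_error_rate` and `benchmark`, the toy `datasets` dictionary, and the use of equal error rate (EER) as the metric. EER is common in spoofing and deepfake detection work, but the abstract does not state which metrics the toolkit reports.

```python
# Hypothetical sketch: per-dataset evaluation of a single detector.
# None of these names come from AUDDT; they are illustrative only.

def equal_error_rate(scores, labels):
    """Approximate EER by sweeping thresholds over the observed scores.

    Assumes a higher score means "more likely fake";
    labels use 1 for fake and 0 for real.
    """
    n_fake = sum(labels)
    n_real = len(labels) - n_fake
    if n_fake == 0 or n_real == 0:
        raise ValueError("EER needs both real and fake samples")

    best_gap, eer = float("inf"), None
    # The extra infinite threshold covers the "flag nothing" operating point.
    for threshold in sorted(set(scores)) + [float("inf")]:
        # Real samples wrongly flagged as fake.
        false_alarms = sum(
            1 for s, y in zip(scores, labels) if y == 0 and s >= threshold
        )
        # Fake samples that slip through as real.
        misses = sum(
            1 for s, y in zip(scores, labels) if y == 1 and s < threshold
        )
        far = false_alarms / n_real
        frr = misses / n_fake
        gap = abs(far - frr)
        if gap < best_gap:
            # Report the midpoint at the threshold where the two rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


def benchmark(detector, datasets):
    """Score every dataset with the same detector and return EER per dataset."""
    results = {}
    for name, samples in datasets.items():
        scores = [detector(audio) for audio, _ in samples]
        labels = [label for _, label in samples]
        results[name] = equal_error_rate(scores, labels)
    return results


# Toy stand-ins: each "audio" item is already a number, and the
# "detector" simply returns it as the fake-likelihood score.
datasets = {
    "in_domain": [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)],
    "out_of_domain": [(0.1, 0), (0.6, 0), (0.4, 1), (0.9, 1)],
}
results = benchmark(lambda audio: audio, datasets)
for name, eer in results.items():
    print(f"{name}: EER = {eer:.2f}")
```

Reporting one number per dataset, rather than a single pooled score, is what makes a gap between in-domain and out-of-domain performance visible. In this toy data the detector separates the first set perfectly but not the second.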