Are audio DeepFake detection models polyglots?
Authors: Bartłomiej Marek, Piotr Kawa, Piotr Syga
Published: 2024-12-23 19:32:53+00:00
AI Summary
This research benchmarks multilingual audio DeepFake detection by evaluating various adaptation strategies. Experiments on models trained on English datasets, together with intra- and cross-linguistic adaptation, reveal considerable variation in detection efficacy, highlighting the importance of target-language data.
Abstract
Since the majority of audio DeepFake (DF) detection methods are trained on English-centric datasets, their applicability to non-English languages remains largely unexplored. In this work, we present a benchmark for the multilingual audio DF detection challenge by evaluating various adaptation strategies. Our experiments focus on analyzing models trained on English benchmark datasets, as well as intra-linguistic (same-language) and cross-linguistic adaptation approaches. Our results indicate considerable variations in detection efficacy, highlighting the difficulties of multilingual settings. We show that limiting the training data to English negatively impacts detection efficacy, and we stress the importance of data in the target language.