Is Audio Spoof Detection Robust to Laundering Attacks?

Authors: Hashim Ali, Surya Subramani, Shefali Sudhir, Raksha Varahamurthy, Hafiz Malik

Published: 2024-08-27 00:35:29+00:00

AI Summary

This paper introduces a new laundering attack database, ASVSpoof Laundering Database, created by applying various real-world audio distortions to the ASVSpoof 2019 database. Seven state-of-the-art audio spoof detection approaches are evaluated on this new database, revealing their vulnerability to these attacks.

Abstract

Voice-cloning (VC) systems have seen an exceptional increase in the realism of synthesized speech in recent years. The high quality of synthesized speech and the availability of low-cost VC services have given rise to many potential abuses of this technology. Several detection methodologies have been proposed over the years that can detect voice spoofs with reasonably good accuracy. However, these methodologies are mostly evaluated on clean audio databases, such as ASVSpoof 2019. This paper evaluates SOTA Audio Spoof Detection approaches in the presence of laundering attacks. In that regard, a new laundering attack database, called the ASVSpoof Laundering Database, is created. This database is based on the ASVSpoof 2019 (LA) eval database comprising a total of 1388.22 hours of audio recordings. Seven SOTA audio spoof detection approaches are evaluated on this laundered database. The results indicate that SOTA systems perform poorly in the presence of aggressive laundering attacks, especially reverberation and additive noise attacks. This suggests the need for robust audio spoof detection.
Key findings
State-of-the-art audio spoof detection systems perform poorly when subjected to aggressive laundering attacks, particularly reverberation and additive noise. End-to-end learning systems generally outperform other approaches, but still show significant performance degradation under these attacks.
Approach
The authors created a new dataset by applying various real-world audio processing operations (reverberation, additive noise, recompression, resampling, low-pass filtering) to the ASVSpoof 2019 dataset. Seven existing audio spoof detection models were then evaluated on this laundered dataset to assess their robustness to these common real-world audio degradations.
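As an illustration of one such laundering operation, the sketch below adds noise to a clean waveform at a chosen signal-to-noise ratio. This is a generic, minimal example using NumPy; the paper's actual noise sources, SNR levels, and processing toolchain are not specified here, and the function name and signature are illustrative assumptions.

```python
import numpy as np

def add_noise_at_snr(clean: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Additive-noise laundering sketch: mix `noise` into `clean` at `snr_db` dB SNR.

    Hypothetical helper, not the paper's implementation.
    """
    # Tile/truncate the noise so it matches the clean signal's length.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]

    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale the noise so that 10*log10(clean_power / noise_power) equals snr_db.
    target_noise_power = clean_power / (10 ** (snr_db / 10))
    scaled_noise = noise * np.sqrt(target_noise_power / noise_power)
    return clean + scaled_noise
```

A lower `snr_db` (e.g. 0 dB) yields a more aggressive attack; the paper's finding is that detection accuracy degrades most under such aggressive additive-noise and reverberation conditions.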
Datasets
ASVSpoof 2019 (LA) eval database, ASVSpoof Laundering Database (created by the authors)
Model(s)
CQCC-GMM, LFCC-GMM, LFCC-LCNN, OC-Softmax, RawNet2, RawGat-ST, AASIST
Author countries
USA