Audio Spoofing Verification using Deep Convolutional Neural Networks by Transfer Learning

Authors: Rahul T P, P R Aravind, Ranjith C, Usamath Nechiyil, Nandakumar Paramparambath

Published: 2020-08-08 07:14:40+00:00

AI Summary

This paper proposes a deep convolutional neural network (DCNN) based speech classifier for detecting spoofing attacks in speaker verification systems. Using a ResNet-34 architecture with Mel-spectrogram input, a single model reaches equal error rates (EER) of 5.32% (logical access) and 5.74% (physical access) on the ASVspoof 2019 evaluation sets.

Abstract

Automatic Speaker Verification systems are gaining popularity, and spoofing attacks are a prime concern because they make these systems vulnerable. Some spoofing attacks, such as replay attacks, are easy to implement but very hard to detect, which creates the need for suitable countermeasures. In this paper, we propose a speech classifier based on a deep convolutional neural network to detect spoofing attacks. Our proposed methodology feeds an acoustic time-frequency representation of power spectral densities on the Mel frequency scale (Mel-spectrogram) to a deep residual network (an adaptation of the ResNet-34 architecture). Using a single-model system, we achieved an equal error rate (EER) of 0.9056% on the development dataset and 5.32% on the evaluation dataset of the logical access scenario, and an EER of 5.87% on the development dataset and 5.74% on the evaluation dataset of the physical access scenario of ASVspoof 2019.


Key findings
The proposed single-model system achieved EERs of 0.9056% and 5.32% on the development and evaluation datasets, respectively, for the logical access scenario, and 5.87% and 5.74% for the physical access scenario. These results outperform the baseline systems provided by the ASVspoof 2019 organizers.
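For readers unfamiliar with the metric, the EER is the operating point at which the false acceptance rate (spoofed audio accepted) equals the false rejection rate (bona fide audio rejected). The sketch below is a minimal illustration, not the official ASVspoof scoring tool: the function name `equal_error_rate` is hypothetical, it assumes that a higher score means "more likely bona fide", and it approximates the crossing point by sweeping over the observed scores rather than interpolating.

```python
def equal_error_rate(bona_fide_scores, spoof_scores):
    """Approximate EER (as a fraction) by sweeping thresholds over observed scores.

    Assumption: higher score = more likely bona fide.
    """
    thresholds = sorted(set(bona_fide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: bona fide trials scoring below the threshold.
        frr = sum(s < t for s in bona_fide_scores) / len(bona_fide_scores)
        # False acceptance: spoofed trials scoring at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    bona = [0.9, 0.8, 0.7, 0.6]
    spoof = [0.1, 0.2, 0.3, 0.65]
    print(f"EER = {100 * equal_error_rate(bona, spoof):.2f}%")
```

An EER of 5.32% therefore means that, at the threshold where both error types are balanced, roughly 5.32% of spoofed trials are accepted and 5.32% of bona fide trials are rejected.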
Approach
The authors employ transfer learning with a modified ResNet-34 architecture to classify audio as bona fide or spoofed. Mel-spectrograms of the audio are used as input to the DCNN. The model is trained on the ASVspoof 2019 training dataset and evaluated on its development and evaluation datasets.
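The Mel-spectrogram input can be illustrated with a small standard-library sketch. Everything specific here is an assumption: this summary does not state the FFT size, hop length, or number of Mel bands, so the defaults below are placeholders, and the Hz-to-Mel formula is the common HTK-style variant. In practice a library such as librosa or torchaudio would be used instead of this naive DFT.

```python
import cmath
import math


def hz_to_mel(f):
    # HTK-style Mel scale (an assumed variant; others exist).
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum of one Hann-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the Mel scale."""
    max_mel = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(max_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(n_mels):
        lo, centre, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (centre - lo)
            falling = (hi - f) / (hi - centre)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def mel_spectrogram(signal, sample_rate, n_fft=256, hop=128, n_mels=20):
    """Return log-Mel energies in dB with shape (num_frames, n_mels)."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    result = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        power = power_spectrum(signal[start:start + n_fft])
        energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        result.append([10.0 * math.log10(e + 1e-10) for e in energies])
    return result


if __name__ == "__main__":
    sr = 8000
    tone = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(512)]
    spec = mel_spectrogram(tone, sr)
    print(len(spec), "frames x", len(spec[0]), "Mel bands")
```

The resulting two-dimensional array (time by Mel band) is what lets an image-style network such as ResNet-34 treat audio classification as an image classification problem.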
Datasets
ASVspoof 2019 dataset (logical and physical access scenarios)
Model(s)
ResNet-34 (modified)
Author countries
India