Adversarial Attacks on Spoofing Countermeasures of Automatic Speaker Verification

Authors: Songxiang Liu, Haibin Wu, Hung-yi Lee, Helen Meng

Published: 2019-10-19 07:28:39+00:00

AI Summary

This paper investigates the vulnerability of automatic speaker verification (ASV) spoofing countermeasures to adversarial attacks. The authors implement high-performing countermeasure models and test their robustness against Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) attacks in both white-box and black-box scenarios.
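For intuition, the following is a minimal PyTorch sketch of the one-step FGSM update the paper applies; the model interface, the epsilon value, and the cross-entropy loss are illustrative assumptions, not the authors' exact configuration.

    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, x, y, eps):
        # One-step FGSM: perturb the input by eps in the direction of the
        # sign of the loss gradient, so the countermeasure's loss increases.
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        return (x_adv + eps * x_adv.grad.sign()).detach()

Here eps bounds the per-element perturbation size, trading off attack strength against perceptibility.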

Abstract

High-performance spoofing countermeasure systems for automatic speaker verification (ASV) have been proposed in the ASVspoof 2019 challenge. However, the robustness of such systems under adversarial attacks has not yet been studied. In this paper, we investigate the vulnerability of spoofing countermeasures for ASV under both white-box and black-box adversarial attacks using the fast gradient sign method (FGSM) and the projected gradient descent (PGD) method. We implement high-performing countermeasure models from the ASVspoof 2019 challenge and conduct adversarial attacks on them. We compare the performance of black-box attacks across spoofing countermeasure models with different network architectures and different numbers of model parameters. The experimental results show that all implemented countermeasure models are vulnerable to FGSM and PGD attacks in the white-box scenario. The experimental results also show that the more dangerous black-box attacks are effective.


Key findings
All three implemented countermeasure models were vulnerable to both FGSM and PGD attacks in white-box scenarios. Black-box attacks were also effective, demonstrating that these ASV spoofing countermeasures are not robust against adversarial examples. Models with more parameters were more resilient to attacks but were still vulnerable.
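In this setting, black-box attacks rely on transferability: adversarial examples are crafted on a substitute model the attacker controls and then fed to the unseen target model. A hypothetical sketch of such an evaluation, reusing the fgsm_attack function sketched above (surrogate and target stand for any two countermeasure models), could look like:

    import torch

    def transfer_success_rate(surrogate, target, x, y, eps):
        # Craft adversarial examples on the surrogate (white-box access),
        # then measure how often they flip the target model's decision.
        adv = fgsm_attack(surrogate, x, y, eps)
        with torch.no_grad():
            flipped = target(adv).argmax(dim=1) != y
        return flipped.float().mean().item()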
Approach
The authors implemented high-performing spoofing countermeasure models from the ASVspoof 2019 challenge. They then used the FGSM and PGD methods to generate adversarial audio examples, evaluating the models' performance under both white-box (full model knowledge) and black-box (limited model knowledge) attack scenarios.
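A hedged PyTorch sketch of the PGD variant follows: it iterates gradient-sign steps and projects the result back onto the L-infinity ball of radius eps around the original input. The step size alpha and step count are illustrative placeholders, not the paper's settings.

    import torch
    import torch.nn.functional as F

    def pgd_attack(model, x, y, eps, alpha, steps):
        # Iterative FGSM with projection onto the L-infinity eps-ball around x.
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv = x_adv.detach().requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = x_adv + alpha * grad.sign()
            # Projection: keep the accumulated perturbation within [-eps, eps].
            x_adv = x + torch.clamp(x_adv - x, min=-eps, max=eps)
        return x_adv.detach()

Because PGD takes many small projected steps instead of one large one, it typically finds stronger adversarial examples than FGSM at the same perturbation budget.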
Datasets
ASVspoof 2019 dataset (LA partition)
Model(s)
LCNN-big, LCNN-small, and SENet12. These models utilized architectures similar to those submitted to the ASVspoof 2019 challenge, employing A-Softmax and traditional softmax loss functions.
Author countries
Hong Kong, Taiwan