Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals

Authors: Tomi Kinnunen, Héctor Delgado, Nicholas Evans, Kong Aik Lee, Ville Vestman, Andreas Nautsch, Massimiliano Todisco, Xin Wang, Md Sahidullah, Junichi Yamagishi, Douglas A. Reynolds

Published: 2020-07-12 12:44:08+00:00

AI Summary

This paper presents extensions to the tandem detection cost function (t-DCF), a risk-based approach for assessing spoofing countermeasures (CMs) used with automatic speaker verification (ASV). These extensions include a simplified t-DCF, analysis of a fixed ASV system case, simulations for interpretation, and new analyses using the ASVspoof 2019 database.

Abstract

Recent years have seen growing efforts to develop spoofing countermeasures (CMs) to protect automatic speaker verification (ASV) systems from being deceived by manipulated or artificial inputs. The reliability of spoofing CMs is typically gauged using the equal error rate (EER) metric. The primitive EER fails to reflect application requirements and the impact of spoofing and CMs upon ASV, and its use as a primary metric in traditional ASV research has long been abandoned in favour of risk-based approaches to assessment. This paper presents several new extensions to the tandem detection cost function (t-DCF), a recent risk-based approach to assess the reliability of spoofing CMs deployed in tandem with an ASV system. Extensions include a simplified version of the t-DCF with fewer parameters, an analysis of a special case for a fixed ASV system, simulations which give original insights into its interpretation, and new analyses using the ASVspoof 2019 database. It is hoped that adoption of the t-DCF for CM assessment will help to foster closer collaboration between the anti-spoofing and ASV research communities.
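
To make the EER mentioned in the abstract concrete, the following is a minimal sketch of how it might be estimated from CM scores. It assumes that higher scores indicate bona fide speech and that a trial is accepted when its score is at or above the threshold. The function names and the score lists are invented for illustration and are not taken from the paper. Note that the computation uses no priors or costs, which is the limitation the abstract points to.

```python
def error_rates(bona_fide, spoof, threshold):
    """Miss and false-alarm rates when scores >= threshold are accepted."""
    p_miss = sum(s < threshold for s in bona_fide) / len(bona_fide)
    p_fa = sum(s >= threshold for s in spoof) / len(spoof)
    return p_miss, p_fa


def equal_error_rate(bona_fide, spoof):
    """Approximate the EER by sweeping thresholds over the observed scores
    and keeping the one where the miss and false-alarm rates are closest."""
    best = None
    for t in sorted(set(bona_fide) | set(spoof)):
        p_miss, p_fa = error_rates(bona_fide, spoof, t)
        gap = abs(p_miss - p_fa)
        if best is None or gap < best[0]:
            # Report the average of the two rates at the closest point.
            best = (gap, (p_miss + p_fa) / 2, t)
    _, eer, threshold = best
    return eer, threshold


# Made-up CM scores, for illustration only.
bona_fide_scores = [2.1, 1.7, 0.9, 1.3, 0.2]
spoof_scores = [-1.5, -0.3, 0.4, -0.8, 1.0]

eer, eer_threshold = equal_error_rate(bona_fide_scores, spoof_scores)
print(f"EER ~ {eer:.2f} at threshold {eer_threshold}")
```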


Key findings

The ASV-constrained t-DCF values are systematically higher than unconstrained t-DCF values, as expected. No CMs reached the ASV floor, suggesting room for improvement. The minimum t-DCF decreases with increasing spoofing prior, highlighting the benefit of CMs when spoofing attacks are more likely.

Approach

The authors extend the tandem detection cost function (t-DCF) to better evaluate the performance of spoofing countermeasures in tandem with automatic speaker verification systems. This involves simplifying the t-DCF, analyzing a fixed ASV system scenario, and conducting simulations and new analyses using the ASVspoof 2019 database.
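
As a rough illustration of the fixed-ASV case, the sketch below assumes the simplified t-DCF can be written as a constant plus weighted CM miss and false-alarm rates, and reports its minimum over CM thresholds. The coefficients c0, c1 and c2 are placeholders for quantities that would be derived from the ASV error rates, priors and costs. Their values, the function name and the scores are invented here and do not come from the paper. Under this assumed form, c0 acts as a floor that only an error-free CM would reach.

```python
def min_tdcf(bona_fide, spoof, c0, c1, c2):
    """Minimum of the assumed cost c0 + c1 * P_miss_cm + c2 * P_fa_cm
    over CM thresholds (scores >= threshold are accepted as bona fide)."""
    # Sweep the observed scores, plus +inf so that "reject everything"
    # is also considered.
    thresholds = sorted(set(bona_fide) | set(spoof)) + [float("inf")]
    best_cost, best_threshold = None, None
    for t in thresholds:
        p_miss = sum(s < t for s in bona_fide) / len(bona_fide)
        p_fa = sum(s >= t for s in spoof) / len(spoof)
        cost = c0 + c1 * p_miss + c2 * p_fa
        if best_cost is None or cost < best_cost:
            best_cost, best_threshold = cost, t
    return best_cost, best_threshold


# Made-up CM scores and hypothetical coefficients, for illustration only.
bona_fide_scores = [2.1, 1.7, 0.9, 1.3, 0.2]
spoof_scores = [-1.5, -0.3, 0.4, -0.8, 1.0]
C0, C1, C2 = 0.05, 0.9, 0.4

cost, threshold = min_tdcf(bona_fide_scores, spoof_scores, C0, C1, C2)
print(f"min cost ~ {cost:.2f} at threshold {threshold} (floor c0 = {C0})")
```

Unlike an equal-error threshold, the cost-minimising threshold shifts with the coefficients: raising c2 (for example, a larger spoofing prior) pushes it towards rejecting more spoofed trials.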

Datasets

ASVspoof 2015, ASVspoof 2017, ASVspoof 2019 (LA and PA), VoxCeleb1 and VoxCeleb2

Model(s)

Time-delay neural network (TDNN) based x-vector speaker embeddings with a probabilistic linear discriminant analysis (PLDA) backend.

Author countries

Finland, Spain, France, Japan, USA