ASVspoof2019 vs. ASVspoof5: Assessment and Comparison

Authors: Avishai Weizman, Yehuda Ben-Shimol, Itshak Lapidot

Published: 2025-05-21 18:04:44+00:00

AI Summary

This paper compares the ASVspoof2019 and ASVspoof5 databases for spoofing detection in automatic speaker verification. It attributes the increased difficulty of ASVspoof5 to mismatched conditions in both bona fide and spoofed speech statistics, and shows that genuine speech in ASVspoof5 is statistically closer to spoofed speech than in ASVspoof2019.

Abstract

ASVspoof challenges are designed to advance the understanding of spoofing speech attacks and encourage the development of robust countermeasure systems. These challenges provide a standardized database for assessing and comparing spoofing-robust automatic speaker verification solutions. The ASVspoof5 challenge introduces a shift in database conditions compared to ASVspoof2019. While ASVspoof2019 has mismatched conditions only in spoofing attacks in the evaluation set, ASVspoof5 incorporates mismatches in both bona fide and spoofed speech statistics. This paper examines the impact of these mismatches, presenting qualitative and quantitative comparisons within and between the two databases. We show the increased difficulty for genuine and spoofed speech and demonstrate that in ASVspoof5, not only are the attacks more challenging, but the genuine speech also shifts toward spoofed speech compared to ASVspoof2019.


Key findings
ASVspoof5 presents a significantly more challenging task than ASVspoof2019 due to mismatched conditions in both bona fide and spoofed speech. The bona fide speech in ASVspoof5 shows greater variability and is statistically closer to spoofed speech than in ASVspoof2019. Codec application in ASVspoof5 further increases the difficulty of detection.
Approach
The authors use probability mass functions (PMFs) of speech waveforms to compare the datasets. Similarity measures and dimensionality reduction (UMAP) are applied to the PMF-based embeddings to visualize and quantify the differences between the datasets, focusing on matched and mismatched conditions.
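As a rough illustration of the PMF-based comparison described above, the following sketch estimates an amplitude PMF from integer PCM samples and compares two PMFs. It is a minimal example under stated assumptions, not the authors' implementation: the summary does not say which similarity measure or bin count the paper uses, so the Jensen-Shannon divergence, the 256-bin default, and the function names `waveform_pmf` and `js_divergence` are all illustrative choices.

```python
import math
from collections import Counter


def waveform_pmf(samples, n_bits=16, n_bins=256):
    """Estimate a PMF over amplitude bins from signed integer PCM samples.

    n_bits and n_bins are assumptions for this sketch, not values
    taken from the paper.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    lo = -(2 ** (n_bits - 1))        # smallest representable amplitude
    width = (2 ** n_bits) / n_bins   # amplitude values covered by one bin
    counts = Counter()
    for s in samples:
        b = int((s - lo) // width)
        counts[max(0, min(b, n_bins - 1))] += 1  # clamp out-of-range samples
    total = len(samples)
    return [counts.get(b, 0) / total for b in range(n_bins)]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (an assumed similarity measure).

    Returns 0 for identical PMFs and 1 for PMFs with disjoint support.
    """
    if len(p) != len(q):
        raise ValueError("PMFs must have the same number of bins")
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x == 0 contribute nothing; y > 0 wherever x > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

In use, one PMF would be computed per utterance or per subset (for example, bona fide versus spoofed speech), and the pairwise divergences would then quantify how close the subsets are to each other.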
Datasets
ASVspoof2019 and ASVspoof5 databases (Logical Access subsets)
Model(s)
UMAP for dimensionality reduction; a one-class softmax (OCS) system from prior work is used for evaluating the impact of mismatches on detection performance. PMF-based embeddings are calculated and used as input to UMAP and the OCS system.
Author countries
Israel, France