ASVspoof2019 vs. ASVspoof5: Assessment and Comparison

Authors: Avishai Weizman, Yehuda Ben-Shimol, Itshak Lapidot

Published: 2025-05-21 18:04:44+00:00

Comment: 5 pages, 3 figures. Accepted to Interspeech 2025 Conference

Journal Ref: https://www.isca-archive.org/interspeech_2025/weizman25_interspeech.pdf

AI Summary

This paper analyzes and compares the ASVspoof2019 and ASVspoof5 challenge databases, focusing on changes in database conditions and their impact on spoofing detection. It highlights that ASVspoof5 introduces significant mismatches in both bona fide and spoofed speech statistics, making it considerably more challenging than ASVspoof2019. The study demonstrates that in ASVspoof5, genuine speech statistically shifts closer to spoofed speech, increasing the difficulty for countermeasure systems.

Abstract

ASVspoof challenges are designed to advance the understanding of spoofing speech attacks and encourage the development of robust countermeasure systems. These challenges provide a standardized database for assessing and comparing spoofing-robust automatic speaker verification solutions. The ASVspoof5 challenge introduces a shift in database conditions compared to ASVspoof2019. While ASVspoof2019 has mismatched conditions only in spoofing attacks in the evaluation set, ASVspoof5 incorporates mismatches in both bona fide and spoofed speech statistics. This paper examines the impact of these mismatches, presenting qualitative and quantitative comparisons within and between the two databases. We show the increased difficulty for genuine and spoofed speech and demonstrate that in ASVspoof5, not only are the attacks more challenging, but the genuine speech also shifts toward spoofed speech compared to ASVspoof2019.


Key findings
ASVspoof2019 has matched conditions for bona fide speech across its subsets, whereas ASVspoof5 introduces significant mismatches in bona fide speech distributions, posing a new challenge. In ASVspoof5, the attacks are more challenging, and genuine speech shifts statistically closer to spoofed speech, making the two harder to distinguish. The application of codecs in ASVspoof5's evaluation set further alters the data distribution, increasing the complexity of anti-spoofing tasks.
Approach
The authors assess the databases using waveform Probability Mass Function (PMF) distributions, several similarity measures (e.g., Symmetric Kullback-Leibler, Modified Kolmogorov-Smirnov, Hellinger distance), and PMF-based embeddings. Uniform Manifold Approximation and Projection (UMAP) is used for dimensionality reduction and visualization of the data distributions. To evaluate the impact of the mismatches, a One-Class Softmax (OCS) countermeasure system (from prior work) is used to calculate bona fide miss rates.
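As a rough illustration of the PMF-based comparison described above, the following sketch estimates an amplitude PMF from a waveform and compares two PMFs with the symmetric Kullback-Leibler divergence and the Hellinger distance. It is not the authors' implementation: the bin count, the smoothing constant `eps`, the unnormalized form of the symmetric KL, the function names, and the synthetic signals are all assumptions made for this example. The Modified Kolmogorov-Smirnov measure is left out because its exact definition is not given here.

```python
import numpy as np


def waveform_pmf(samples, n_bins=256, eps=1e-12):
    """Estimate a PMF over amplitude bins for a waveform scaled to [-1, 1].

    n_bins and eps are illustrative choices, not values from the paper.
    """
    counts, _ = np.histogram(samples, bins=n_bins, range=(-1.0, 1.0))
    pmf = counts.astype(np.float64) + eps  # smoothing avoids log(0) below
    return pmf / pmf.sum()


def symmetric_kl(p, q):
    """KL(p||q) + KL(q||p), which simplifies to sum((p - q) * log(p / q))."""
    return float(np.sum((p - q) * np.log(p / q)))


def hellinger(p, q):
    """Hellinger distance between two PMFs; always lies in [0, 1]."""
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))


# Synthetic stand-ins for two sets of waveforms (hypothetical data).
rng = np.random.default_rng(0)
wave_a = np.clip(rng.normal(0.0, 0.1, 16000), -1.0, 1.0)
wave_b = np.clip(rng.laplace(0.0, 0.1, 16000), -1.0, 1.0)

p_a = waveform_pmf(wave_a)
p_b = waveform_pmf(wave_b)

print("symmetric KL:", symmetric_kl(p_a, p_b))
print("Hellinger:   ", hellinger(p_a, p_b))
```

In practice the PMFs would be pooled over all files of a subset (for example, bona fide training versus bona fide evaluation), and a larger distance would indicate a stronger mismatch between the two subsets.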
Datasets
ASVspoof2019, ASVspoof5
Model(s)
One-Class Softmax (OCS) system (from [33])
Author countries
Israel, France