A Comparative Study of Fusion Methods for SASV Challenge 2022

Authors: Petr Grinberg, Vladislav Shikhov

Published: 2022-03-31 11:43:01+00:00

AI Summary

This paper investigates various fusion methods for combining embeddings from Automatic Speaker Verification (ASV) and countermeasure (CM) systems in the Spoofing Aware Speaker Verification (SASV) Challenge 2022. The authors explore different fusion techniques, including boosting over embeddings (CatBoost), which outperforms existing methods.

Abstract

An Automatic Speaker Verification (ASV) system is a type of biometric authentication. It can be attacked by an intruder who falsifies data in order to gain access to protected information. Countermeasures (CM) are special algorithms that detect these spoofing attacks. While the ASVspoof Challenge series focused on developing CMs for a fixed ASV system, the organizers of the new Spoofing Aware Speaker Verification (SASV) Challenge believe that the best results can be achieved if the CM and ASV systems are optimized jointly. One approach to joint optimization is fusion over embeddings or scores obtained from ASV and CM models. The baselines of SASV Challenge 2022 present two types of fusion: score-sum and a back-end ensemble with a 3-layer MLP. This paper describes our research into other fusion methods, including boosting over embeddings, which has not been used in anti-spoofing studies before.
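
To make the score-sum idea concrete, here is a minimal sketch of score-level fusion. It is not the challenge's baseline code. It assumes the ASV score is the cosine similarity between an enrollment and a test embedding, and that the CM score is a bona fide probability in [0, 1]. The function names and toy vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def score_sum_fusion(enroll_emb, test_emb, cm_score):
    """Fused score = ASV similarity + CM score (assumed form, no trained parameters)."""
    return cosine_similarity(enroll_emb, test_emb) + cm_score


# Toy 3-dimensional "speaker embeddings" (hypothetical values).
enroll = [1.0, 0.0, 0.0]
same_speaker = [0.9, 0.1, 0.0]
other_speaker = [0.0, 1.0, 0.0]

# Hypothetical CM scores: 0.95 for bona fide speech, 0.05 for a spoofed utterance.
target = score_sum_fusion(enroll, same_speaker, cm_score=0.95)
nontarget = score_sum_fusion(enroll, other_speaker, cm_score=0.95)
spoof = score_sum_fusion(enroll, same_speaker, cm_score=0.05)

print(f"target={target:.3f} nontarget={nontarget:.3f} spoof={spoof:.3f}")
```

In this toy setup, a trial scores high only when both subsystems agree: a speaker mismatch or a low CM score each lowers the sum.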

Key findings

CatBoost, a gradient boosting method applied to the concatenated embeddings, significantly outperforms the baseline methods and the other tested classifiers on the SASV 2022 challenge. The remaining fusion methods vary in performance: for embedding fusion, some non-linear methods do better than linear ones, whereas for score fusion, logistic regression and SVM give better results.

Approach

The researchers concatenate embeddings from multiple ASV and CM models (including RawNet2 and LightCNN with various spectrograms) into a single feature vector. This vector is then fed into various classifiers (MLP, Logistic Regression, SVM, CatBoost, etc.) to detect spoofing attacks.
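
The concatenate-then-classify pipeline described above can be sketched as follows. This is an illustrative toy, not the authors' implementation. The embeddings are synthetic, the dimensions (4 for ASV, 3 for CM) are arbitrary, and a small hand-written logistic regression stands in for whichever back-end classifier is used (MLP, SVM, CatBoost, and so on). All names are hypothetical.

```python
import math
import random


def concat_embeddings(asv_enroll, asv_test, cm_test):
    """Fuse one trial into a single feature vector by concatenation."""
    return list(asv_enroll) + list(asv_test) + list(cm_test)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, b, x):
    """Probability that the fused vector x belongs to an accepted (target) trial."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_logreg(X, y, lr=0.1, epochs=200):
    """Plain SGD logistic regression; a stand-in for any back-end classifier."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            g = predict(w, b, x) - t  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def make_trial(label, rng):
    """Synthetic trial: features centred at +1 (accept) or -1 (reject).

    This is purely so the toy classifier has something to learn;
    real embeddings come from trained ASV and CM networks.
    """
    center = 1.0 if label == 1 else -1.0

    def draw(n):
        return [rng.gauss(center, 0.5) for _ in range(n)]

    return concat_embeddings(draw(4), draw(4), draw(3))


rng = random.Random(0)
labels = [i % 2 for i in range(40)]
X = [make_trial(t, rng) for t in labels]

w, b = train_logreg(X, labels)
accuracy = sum((predict(w, b, x) > 0.5) == bool(t) for x, t in zip(X, labels)) / len(X)
print(f"fused dimension={len(X[0])}, training accuracy={accuracy:.2f}")
```

Swapping `train_logreg` and `predict` for another classifier leaves the fusion step unchanged, which is what makes it straightforward to compare many back-ends on the same concatenated features.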

Datasets

VoxCeleb2, ASVspoof 2019 LA train partition

Model(s)

ECAPA-TDNN (ASV), AASIST, RawNet2, LightCNN (CM), 3-layer MLP, Logistic Regression, SVM (linear, RBF, polynomial kernels), Random Fourier Features (RFF) with logistic regression, Gaussian Mixture Model (GMM), Random Forest, CatBoost

Author countries

Russia
