Advanced Signal Analysis in Detecting Replay Attacks for Automatic Speaker Verification Systems
Authors: Lee Shih Kuang
Published: 2024-03-02 08:19:58+00:00
AI Summary
This research introduces three novel signal analysis methods for replay speech detection in automatic speaker verification: arbitrary analysis, mel scale analysis, and constant Q analysis, all inspired by the Fourier inversion formula. The constant Q analysis is reported to be more efficient than the conventional constant Q transform, the mel scale analysis is reported to be more effective than conventional methods, and the methods work well when integrated with a feature based on the temporal autocorrelation of speech.
Abstract
This study proposes novel signal analysis methods for replay speech detection in automatic speaker verification (ASV) systems. The proposed methods, arbitrary analysis (AA), mel scale analysis (MA), and constant Q analysis (CQA), are inspired by the calculation of the Fourier inversion formula. They offer a new perspective on signal analysis for replay speech detection by employing alternative groups of sinusoidal sequences. Their efficacy is examined experimentally on the ASVspoof 2019 and 2021 PA databases and confirmed by the performance of systems that incorporate them. It is confirmed most strongly by their successful integration with a speech feature that computes the temporal autocorrelation of speech (TAC) from complex spectra. Moreover, compared with conventional methods, the proposed CQA is more efficient (approximately 2.36 times as fast as the conventional constant Q transform (CQT)) and the proposed MA is more effective in analyzing speech signals, which makes both promising for use in music and speech processing.
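To make the idea of "alternative groups of sinusoidal sequences" concrete, the following is a minimal sketch, not the paper's implementation. It assumes that analysis at arbitrary frequencies can be illustrated by projecting a windowed frame onto complex sinusoids at frequencies of our choosing, here a mel-spaced grid. The function names `mel_spaced_frequencies` and `analyze_frame`, the common 2595/700 mel formula, and the frame length, sample rate, and bin count are all assumptions made for illustration.

```python
import numpy as np


def mel_spaced_frequencies(n_bins, f_min, f_max):
    """Hypothetical helper: n_bins frequencies (Hz) evenly spaced on the mel scale."""
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mels = np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_bins)
    return mel_to_hz(mels)


def analyze_frame(frame, freqs_hz, sample_rate):
    """Project one frame onto complex sinusoids at the given frequencies.

    With freqs_hz = k * sample_rate / N this reduces to the ordinary DFT;
    any other frequency grid gives an "alternative" group of sinusoids.
    """
    n = np.arange(len(frame))
    # One row per analysis frequency, one column per sample index.
    basis = np.exp(-2j * np.pi * np.outer(freqs_hz, n) / sample_rate)
    return basis @ frame


# Illustrative input (assumed values): a Hann-windowed 1 kHz tone at 16 kHz.
sample_rate = 16000
frame_len = 512
t = np.arange(frame_len) / sample_rate
frame = np.sin(2 * np.pi * 1000.0 * t) * np.hanning(frame_len)

freqs = mel_spaced_frequencies(40, 50.0, 8000.0)
spectrum = analyze_frame(frame, freqs, sample_rate)  # complex spectrum, 40 bins
peak_hz = freqs[np.argmax(np.abs(spectrum))]
print(f"strongest mel-spaced bin is near {peak_hz:.1f} Hz")
```

Because the output is a complex spectrum, it keeps phase as well as magnitude, which is the kind of input a feature computed from complex spectra would need. This direct projection costs O(N) per frequency bin and is meant only to show the principle; it says nothing about how the paper achieves its reported speedup.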