Vulnerabilities of Audio-Based Biometric Authentication Systems Against Deepfake Speech Synthesis
Authors: Mengze Hong, Di Jiang, Zeying Xie, Weiwei Zhao, Guan Wang, Chen Jason Zhang
Published: 2026-01-06 10:55:32+00:00
AI Summary
This paper empirically evaluates state-of-the-art speaker authentication systems against modern audio deepfake synthesis. It reveals two critical security vulnerabilities: commercial speaker verification systems are easily bypassed by voice cloning models trained on only small amounts of target-speaker audio, and anti-spoofing detectors fail to generalize robustly to unseen deepfake generation methods. The findings highlight an urgent need for architectural innovations and adaptive multi-factor authentication strategies.
Abstract
As audio deepfakes transition from research artifacts to widely available commercial tools, robust biometric authentication faces pressing security threats in high-stakes industries. This paper presents a systematic empirical evaluation of state-of-the-art speaker authentication systems based on a large-scale speech synthesis dataset, revealing two major security vulnerabilities: 1) modern voice cloning models trained on only small amounts of speech data can easily bypass commercial speaker verification systems; and 2) anti-spoofing detectors struggle to generalize across different methods of audio synthesis, leading to a significant gap between in-domain performance and real-world robustness. These findings call for a reconsideration of security measures and stress the need for architectural innovations, adaptive defenses, and a transition toward multi-factor authentication.
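To make the first vulnerability concrete, speaker verification systems commonly accept an identity claim when the similarity between an enrollment embedding and a test embedding exceeds a threshold; cloned speech that lands close to the enrolled speaker in embedding space therefore passes the check. The sketch below is not taken from the paper and abstracts away the speaker encoder; the embedding dimension, threshold value, and noise levels are illustrative assumptions only.

```python
# Minimal sketch (illustrative, not the paper's system) of the cosine-similarity
# decision rule that a cloned voice must defeat in a typical speaker verification
# pipeline. The speaker encoder itself is abstracted away; embeddings are mocked
# with random vectors purely to demonstrate the accept/reject logic.

import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two fixed-length speaker embeddings."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def verify(enroll_emb: np.ndarray, test_emb: np.ndarray, threshold: float = 0.65) -> bool:
    """Accept the identity claim if similarity exceeds the (illustrative) threshold."""
    return cosine_similarity(enroll_emb, test_emb) >= threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    enrolled = rng.standard_normal(192)                   # e.g. a 192-dim speaker embedding
    genuine = enrolled + 0.1 * rng.standard_normal(192)   # same speaker, small session variation
    cloned = enrolled + 0.2 * rng.standard_normal(192)    # a sufficiently close clone also passes
    print("genuine accepted:", verify(enrolled, genuine))
    print("clone accepted:  ", verify(enrolled, cloned))
```

The paper's first finding can be read in these terms: voice cloning models produce embeddings close enough to the enrolled speaker's that a threshold test of this kind accepts them, which is why the abstract argues for adaptive defenses and multi-factor authentication rather than reliance on the voice channel alone.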