Quantum-Trained Convolutional Neural Network for Deepfake Audio Detection

Authors: Chu-Hsuan Abraham Lin, Chen-Yu Liu, Samuel Yen-Chi Chen, Kuan-Cheng Chen

Published: 2024-10-11 20:52:10+00:00

AI Summary

This paper introduces a Quantum-Trained Convolutional Neural Network (QT-CNN) framework for deepfake audio detection that takes a hybrid quantum-classical approach. The QT-CNN integrates Quantum Neural Networks (QNNs) with a classical CNN to improve training efficiency and cut the number of trainable parameters. It achieves detection performance comparable to traditional CNNs while reducing the parameter count by up to 70%.

Abstract

The rise of deepfake technologies has posed significant challenges to privacy, security, and information integrity, particularly in audio and multimedia content. This paper introduces a Quantum-Trained Convolutional Neural Network (QT-CNN) framework designed to enhance the detection of deepfake audio, leveraging the computational power of quantum machine learning (QML). The QT-CNN employs a hybrid quantum-classical approach, integrating Quantum Neural Networks (QNNs) with classical neural architectures to optimize training efficiency while reducing the number of trainable parameters. Our method incorporates a novel quantum-to-classical parameter mapping that effectively utilizes quantum states to enhance the expressive power of the model, achieving up to 70% parameter reduction compared to classical models without compromising accuracy. Data pre-processing involved extracting essential audio features, label encoding, feature scaling, and constructing sequential datasets for robust model evaluation. Experimental results demonstrate that the QT-CNN achieves comparable performance to traditional CNNs, maintaining high accuracy during training and testing phases across varying configurations of QNN blocks. The QT framework's ability to reduce computational overhead while maintaining performance underscores its potential for real-world applications in deepfake detection and other resource-constrained scenarios. This work highlights the practical benefits of integrating quantum computing into artificial intelligence, offering a scalable and efficient approach to advancing deepfake detection technologies.
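
The pre-processing steps named in the abstract (label encoding, feature scaling, and building sequential datasets) could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the "REAL"/"FAKE" label strings, the choice of standardization as the scaling method, and the sliding-window reading of "sequential datasets" are all assumptions.

```python
import statistics

def label_encode(labels):
    """Map string labels to integer ids (classes sorted alphabetically)."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    return [index[label] for label in labels], classes

def standardize(rows):
    """Scale each feature column to zero mean and unit variance (assumed scaling)."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    # A constant column has zero spread; divide by 1.0 instead of 0.0.
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def make_sequences(rows, labels, window):
    """Group consecutive feature rows into overlapping windows (assumed scheme).

    Each window takes the label of its last row.
    """
    seqs, targets = [], []
    for start in range(len(rows) - window + 1):
        seqs.append(rows[start:start + window])
        targets.append(labels[start + window - 1])
    return seqs, targets

# Toy data: four feature rows with placeholder labels.
features = [[1.0, 10.0], [3.0, 10.0], [1.0, 10.0], [3.0, 10.0]]
encoded, classes = label_encode(["REAL", "FAKE", "REAL", "FAKE"])
scaled = standardize(features)
sequences, targets = make_sequences(scaled, encoded, window=2)
```

In practice the scaling statistics would normally be computed on the training split only and then reused on the test split.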


Key findings
The QT-CNN achieves deepfake audio detection accuracy comparable to traditional CNNs across various QNN block configurations. It reduces the number of trainable parameters by up to 70% compared to classical models without compromising performance, which makes it a candidate for real-world use in resource-constrained environments.
Approach
The QT-CNN employs a hybrid quantum-classical training mechanism in which quantum circuits (QNNs built from parameterized Ry gates and CNOT gates) are used to optimize the weight parameters of a classical CNN. A novel quantum-to-classical parameter mapping translates quantum state measurement probabilities into classical CNN parameters, so that inference can run on classical hardware.
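
To make the idea concrete, the sketch below simulates a small circuit of Ry and CNOT gates with a plain statevector, reads out the basis-state probabilities, and turns them into classical weights. It is an illustration under stated assumptions, not the paper's implementation: the linear CNOT chain, the function names, and in particular the tanh-based `map_to_weights` are hypothetical stand-ins, since the summary does not specify the circuit layout or the form of the mapping.

```python
import math

def apply_ry(state, qubit, theta):
    """Apply an Ry(theta) rotation to one qubit of a real-valued statevector."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    bit = 1 << qubit
    out = state[:]
    for i in range(len(state)):
        if not i & bit:  # i has this qubit in |0>, j has it in |1>
            j = i | bit
            out[i] = c * state[i] - s * state[j]
            out[j] = s * state[i] + c * state[j]
    return out

def apply_cnot(state, control, target):
    """Flip the target qubit wherever the control qubit is |1>."""
    cbit, tbit = 1 << control, 1 << target
    out = state[:]
    for i in range(len(state)):
        if i & cbit and not i & tbit:
            j = i | tbit
            out[i], out[j] = state[j], state[i]
    return out

def qnn_probabilities(thetas, n_qubits):
    """Run blocks of Ry rotations plus a CNOT chain (assumed layout), starting from |0...0>.

    thetas is a list of blocks, each holding one angle per qubit.
    Returns the probability of every computational basis state.
    """
    state = [0.0] * (2 ** n_qubits)
    state[0] = 1.0
    for block in thetas:
        for q, theta in enumerate(block):
            state = apply_ry(state, q, theta)
        for q in range(n_qubits - 1):
            state = apply_cnot(state, q, q + 1)
    return [a * a for a in state]

def map_to_weights(probs, n_weights):
    """Hypothetical mapping: one weight per basis state, zero at the uniform distribution."""
    n_states = len(probs)
    return [math.tanh(n_states * probs[k] - 1.0) for k in range(n_weights)]

# Toy case: 4 qubits give 16 basis states, enough to cover 10 classical weights.
n_weights, n_qubits, n_blocks = 10, 4, 2
thetas = [[0.3] * n_qubits for _ in range(n_blocks)]
probs = qnn_probabilities(thetas, n_qubits)
weights = map_to_weights(probs, n_weights)
```

Only the rotation angles in `thetas` would be trained here; the resulting `weights` are ordinary numbers that can be loaded into a classical network, which is why no quantum circuit is needed once training is finished.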
Datasets
DEEP-VOICE dataset
Model(s)
Quantum-Trained Convolutional Neural Network (QT-CNN) incorporating Quantum Neural Networks (QNNs) with a classical CNN (consisting of two convolutional layers and two fully connected layers).
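
As a rough aid for reasoning about parameter counts, the sketch below tallies the weights and biases of a CNN with two convolutional and two fully connected layers, then computes how many qubits would be needed for the basis states to cover every classical parameter. All layer sizes are hypothetical (the summary does not list them), the use of 1D convolutions is an assumption, and sizing the register as the smallest n with 2^n at least the parameter count is likewise an assumption rather than a detail stated here.

```python
def conv1d_params(in_channels, out_channels, kernel_size):
    """Weights plus one bias per output channel of a 1D convolution."""
    return out_channels * (in_channels * kernel_size + 1)

def linear_params(in_features, out_features):
    """Weights plus one bias per output unit of a fully connected layer."""
    return out_features * (in_features + 1)

def qubits_needed(n_classical):
    """Smallest n with 2**n >= n_classical (assumed register sizing)."""
    return max(1, (n_classical - 1).bit_length())

# Hypothetical layer sizes, chosen only for illustration.
layer_counts = [
    conv1d_params(1, 8, 3),    # conv layer 1
    conv1d_params(8, 16, 3),   # conv layer 2
    linear_params(64, 32),     # fully connected layer 1
    linear_params(32, 2),      # fully connected layer 2 (two output classes)
]
total_classical = sum(layer_counts)
n_qubits = qubits_needed(total_classical)
print(total_classical, n_qubits)  # prints "2578 12" for these sizes
```

The count above covers only the classical network. Any parameters in the quantum-to-classical mapping would have to be added to the Ry angles before comparing the two sides, so this sketch does not by itself reproduce the reported reduction.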
Author countries
UK, Taiwan, USA