Quantum-Trained Convolutional Neural Network for Deepfake Audio Detection

Authors: Chu-Hsuan Abraham Lin, Chen-Yu Liu, Samuel Yen-Chi Chen, Kuan-Cheng Chen

Published: 2024-10-11 20:52:10+00:00

AI Summary

This paper proposes a Quantum-Trained Convolutional Neural Network (QT-CNN) for deepfake audio detection. The QT-CNN uses a hybrid quantum-classical approach, reducing the number of trainable parameters by up to 70% compared to classical CNNs without sacrificing accuracy.

Abstract

The rise of deepfake technologies has posed significant challenges to privacy, security, and information integrity, particularly in audio and multimedia content. This paper introduces a Quantum-Trained Convolutional Neural Network (QT-CNN) framework designed to enhance the detection of deepfake audio, leveraging the computational power of quantum machine learning (QML). The QT-CNN employs a hybrid quantum-classical approach, integrating Quantum Neural Networks (QNNs) with classical neural architectures to optimize training efficiency while reducing the number of trainable parameters. Our method incorporates a novel quantum-to-classical parameter mapping that effectively utilizes quantum states to enhance the expressive power of the model, achieving up to 70% parameter reduction compared to classical models without compromising accuracy. Data pre-processing involved extracting essential audio features, label encoding, feature scaling, and constructing sequential datasets for robust model evaluation. Experimental results demonstrate that the QT-CNN achieves comparable performance to traditional CNNs, maintaining high accuracy during training and testing phases across varying configurations of QNN blocks. The QT framework's ability to reduce computational overhead while maintaining performance underscores its potential for real-world applications in deepfake detection and other resource-constrained scenarios. This work highlights the practical benefits of integrating quantum computing into artificial intelligence, offering a scalable and efficient approach to advancing deepfake detection technologies.
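The abstract's preprocessing pipeline (feature extraction, label encoding, feature scaling, sequential dataset construction) can be illustrated with a minimal sketch. The sketch below assumes librosa-style MFCC features and scikit-learn utilities; the paper's exact feature set, window length, and file names are not specified here, so those choices are placeholders.

```python
# Hypothetical preprocessing sketch (librosa + scikit-learn assumed; the
# paper's exact feature set and window length are not given in this summary).
import numpy as np
import librosa
from sklearn.preprocessing import LabelEncoder, StandardScaler

def extract_features(path, sr=16000, n_mfcc=13):
    """Load an audio clip and return per-frame MFCC features (frames x n_mfcc)."""
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.T

def build_sequences(clip_features, clip_labels, window=32):
    """Slice frame-level features into fixed-length sequential windows."""
    X, y = [], []
    for feat, lab in zip(clip_features, clip_labels):
        for start in range(0, len(feat) - window + 1, window):
            X.append(feat[start:start + window])
            y.append(lab)
    return np.stack(X), np.array(y)

# Label encoding ("REAL"/"FAKE" -> 0/1) and feature scaling.
paths = ["real_clip.wav", "fake_clip.wav"]      # placeholder file names
raw_labels = ["REAL", "FAKE"]
clip_features = [extract_features(p) for p in paths]
clip_labels = LabelEncoder().fit_transform(raw_labels)

X, y = build_sequences(clip_features, clip_labels, window=32)
scaler = StandardScaler().fit(X.reshape(-1, X.shape[-1]))
X = scaler.transform(X.reshape(-1, X.shape[-1])).reshape(X.shape)
```

Scaling is fit on the flattened frame features and applied window by window, so each MFCC dimension is standardized consistently across the sequential dataset.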


Key findings
The QT-CNN achieves accuracy comparable to a traditional CNN while reducing the number of trainable parameters by up to 70%. This demonstrates the potential of the QT framework for resource-efficient deepfake audio detection. The model maintains high accuracy in both the training and testing phases across different QNN block configurations.
Approach
The authors use a hybrid quantum-classical approach in which Quantum Neural Networks (QNNs) are integrated with a classical CNN architecture to make training more efficient. A novel quantum-to-classical parameter mapping uses the quantum states produced by the QNN blocks to generate the classical network's weights, so only the much smaller set of quantum parameters is trained; this is the source of the reported parameter reduction while preserving the model's expressive power. A sketch of the idea follows below.
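The following is a minimal sketch of the quantum-train idea, assuming PennyLane with the PyTorch interface. The qubit count, the StronglyEntanglingLayers ansatz, the target weight count, and the simple affine map from measurement probabilities to weight values are all illustrative stand-ins, not the paper's exact design.

```python
# Sketch of quantum-trained parameter generation (PennyLane + PyTorch assumed).
# A small QNN's measurement distribution is mapped to the weights of a
# classical CNN, so only the QNN rotation angles are trained.
import torch
import pennylane as qml

n_target = 1000                 # classical CNN weights to generate (illustrative)
n_qubits = 10                   # 2**10 = 1024 basis states >= n_target
n_layers = 3
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, interface="torch")
def qnn(angles):
    """Entangling ansatz; returns all 2**n_qubits basis-state probabilities."""
    qml.StronglyEntanglingLayers(angles, wires=range(n_qubits))
    return qml.probs(wires=range(n_qubits))

# The only trainable tensor: n_layers * n_qubits * 3 = 90 rotation angles.
angles = torch.randn(n_layers, n_qubits, 3, requires_grad=True)

def generate_weights():
    """Map measurement probabilities to real-valued CNN weights.

    A simple affine map (center and rescale the probabilities) stands in
    for the paper's quantum-to-classical parameter mapping.
    """
    probs = qnn(angles)                              # shape (2**n_qubits,)
    return (probs[:n_target] - 1.0 / 2**n_qubits) * 2**n_qubits

w = generate_weights()
print(w.shape, angles.numel())                       # torch.Size([1000]) 90
```

Here 90 trained angles generate 1000 classical weights, which illustrates how the scheme can cut trainable parameters; gradients flow from the task loss through the generated weights back to the angles via the torch interface.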
Datasets
DEEP-VOICE dataset
Model(s)
Quantum-Trained Convolutional Neural Network (QT-CNN), Classical Convolutional Neural Network (CNN)
Author countries
UK, Taiwan, USA