D-CAPTCHA++: A Study of Resilience of Deepfake CAPTCHA under Transferable Imperceptible Adversarial Attack


Authors: Hong-Hanh Nguyen-Le, Van-Tuan Tran, Dinh-Thuc Nguyen, Nhien-An Le-Khac

Published: 2024-09-11 16:25:02+00:00

AI Summary

This paper investigates the resilience of the D-CAPTCHA system against transferable imperceptible adversarial attacks. It exposes vulnerabilities in D-CAPTCHA and proposes D-CAPTCHA++, a more robust version that uses adversarial training to mitigate these vulnerabilities, significantly improving the system's resistance to attacks.

Abstract

The advancements in generative AI have enabled the improvement of audio synthesis models, including text-to-speech and voice conversion. This raises concerns about their potential misuse in social manipulation and political interference, as synthetic speech has become indistinguishable from natural human speech. Several speech-generation programs are utilized for malicious purposes, especially impersonating individuals through phone calls. Therefore, detecting fake audio is crucial to maintain social security and safeguard the integrity of information. Recent research has proposed a D-CAPTCHA system based on the challenge-response protocol to differentiate fake phone calls from real ones. In this work, we study the resilience of this system and introduce a more robust version, D-CAPTCHA++, to defend against fake calls. Specifically, we first expose the vulnerability of the D-CAPTCHA system under transferable imperceptible adversarial attack. Secondly, we mitigate this vulnerability by improving the robustness of the system through adversarial training of the D-CAPTCHA deepfake detectors and task classifiers.

Key findings

Adversarial training significantly reduced the attack success rate for both deepfake detectors and task classifiers in D-CAPTCHA++. The success rate of transferable adversarial attacks decreased from 31.31% to 0.60% for the task classifier and from 32.26% to 2.27% for the deepfake detector. Feature extraction techniques were shown to impact the transferability of adversarial examples.
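
To put the reported figures in perspective, the short calculation below converts the before/after attack success rates into absolute and relative reductions. Only the four percentages come from the summary above; the helper name `relative_reduction` is introduced here for illustration.

```python
def relative_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return 100.0 * (before - after) / before

# Attack success rates (%) before and after adversarial training, as reported above.
task_before, task_after = 31.31, 0.60   # task classifier
det_before, det_after = 32.26, 2.27     # deepfake detector

task_rel = relative_reduction(task_before, task_after)
det_rel = relative_reduction(det_before, det_after)

print(f"task classifier:   -{task_before - task_after:.2f} points ({task_rel:.2f}% relative)")
print(f"deepfake detector: -{det_before - det_after:.2f} points ({det_rel:.2f}% relative)")
```

Under this calculation the task classifier's attack success rate falls by about 98% in relative terms and the deepfake detector's by about 93%.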

Approach

The authors improve the D-CAPTCHA system by using adversarial training on its deepfake detectors and task classifiers. This involves creating imperceptible adversarial audio samples using a surrogate model and then training the main detectors and classifiers on these samples to improve their robustness against such attacks.
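
The general pattern described here (craft perturbations on a surrogate, transfer them to a target, then retrain the target on them) can be sketched on toy data. This is a minimal illustration under loud assumptions, not the paper's method: logistic regression stands in for both the LCNN surrogate and the target detector, random Gaussian features stand in for audio, and a one-step FGSM perturbation with an L-infinity bound `eps` is swapped in for the paper's imperceptible audio attack. All function names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, d=20):
    """Toy stand-in for audio features: two Gaussian classes (0 = real, 1 = fake)."""
    y = rng.integers(0, 2, size=n).astype(float)
    X = rng.normal(size=(n, d)) + (2.0 * y[:, None] - 1.0) * 0.5
    return X, y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, epochs=200, lr=0.5):
    """Fit logistic regression with plain gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        err = sigmoid(X @ w + b) - y
        w -= lr * X.T @ err / len(y)
        b -= lr * err.mean()
    return w, b

def fgsm(X, y, w, b, eps):
    """One-step perturbation along the sign of the surrogate's loss gradient."""
    grad_x = (sigmoid(X @ w + b) - y)[:, None] * w[None, :]
    return X + eps * np.sign(grad_x)

def accuracy(X, y, w, b):
    pred = (sigmoid(X @ w + b) > 0.5).astype(float)
    return float((pred == y).mean())

# Surrogate and target are trained on different samples (assumed black-box setting).
X_sur, y_sur = make_data(500)
X_tgt, y_tgt = make_data(500)
X_test, y_test = make_data(500)
eps = 0.3  # assumed perturbation budget

w_s, b_s = train_logreg(X_sur, y_sur)   # surrogate
w_t, b_t = train_logreg(X_tgt, y_tgt)   # undefended target

# Transfer attack: examples crafted on the surrogate, evaluated on the target.
X_test_adv = fgsm(X_test, y_test, w_s, b_s, eps)
acc_clean = accuracy(X_test, y_test, w_t, b_t)
acc_transfer = accuracy(X_test_adv, y_test, w_t, b_t)

# Adversarial training: refit the target on clean plus surrogate-crafted samples.
X_tgt_adv = fgsm(X_tgt, y_tgt, w_s, b_s, eps)
w_r, b_r = train_logreg(np.vstack([X_tgt, X_tgt_adv]),
                        np.concatenate([y_tgt, y_tgt]))
acc_robust = accuracy(X_test_adv, y_test, w_r, b_r)

print(f"target, clean inputs:          {acc_clean:.3f}")
print(f"target, transferred attack:    {acc_transfer:.3f}")
print(f"retrained, transferred attack: {acc_robust:.3f}")
```

The printed accuracies depend entirely on the toy data and say nothing about the paper's results; the sketch only shows where the surrogate, the transfer step, and the retraining step sit relative to one another.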

Datasets

WaveFake, ASVspoof 2019, ASVspoof 2021, AudioSet, HumTrans, GTZAN, VocalSet, CREMA-D, RAVDESS, VocalSound, DASED, DESED, VCTK

Model(s)

LCNN (surrogate model), SpecRNet, RawNet2, RawNet3, ResNet18, GMM, kNN-VC, Urhythmic, TriAAN-VC

Author countries

Ireland, Ireland, Vietnam, Ireland
