Mitigating Adversarial Attacks in Deepfake Detection: An Exploration of Perturbation and AI Techniques

Authors: Saminder Dhesi, Laura Fontes, Pedro Machado, Isibor Kennedy Ihianle, Farhad Fassihi Tash, David Ada Adama

Published: 2023-02-22 23:48:19+00:00

AI Summary

This research explores mitigating adversarial attacks in deepfake detection. The authors develop a customized Convolutional Neural Network (CNN) to detect deepfakes, achieving 76.2% precision on the Deepfake Detection Challenge (DFDC) dataset.

Abstract

Deep learning is a pivotal component of machine learning, offering remarkable capabilities in tasks ranging from image recognition to natural language processing. This very strength, however, also renders deep learning models susceptible to adversarial examples, a phenomenon pervasive across a diverse array of applications. Adversarial examples are subtle perturbations injected into clean images or videos that cause deep learning algorithms to misclassify inputs or produce erroneous outputs. This susceptibility extends beyond digital domains: adversarial examples can also be designed to target human cognition, enabling the creation of deceptive media such as deepfakes. Deepfakes in particular have emerged as a potent tool for manipulating public opinion and tarnishing the reputations of public figures, underscoring the urgent need to address the security and ethical implications of adversarial examples. This article examines adversarial examples and the principles behind their capacity to deceive deep learning algorithms, from their role in compromising model reliability to their impact on the contemporary landscape of disinformation and misinformation. To illustrate progress in combating adversarial examples, we present a tailored Convolutional Neural Network (CNN) designed explicitly to detect deepfakes, a step towards enhancing model robustness against adversarial threats. This custom CNN achieves a precision of 76.2% on the DFDC dataset.
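To make the perturbation mechanism described above concrete, the sketch below implements the Fast Gradient Sign Method (FGSM), a canonical white-box attack of the kind the paper evaluates against. The paper does not specify which attack or parameters it uses, so the PyTorch model interface and the epsilon value here are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, images, labels, epsilon=0.01):
    """One-step FGSM: nudge each pixel in the direction that maximally
    increases the classification loss, keeping values in [0, 1].

    `model`, `images`, `labels`, and `epsilon` are illustrative; the paper
    does not report its attack configuration.
    """
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    # A single signed-gradient step is often enough to flip the prediction.
    adversarial = images + epsilon * images.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```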


Key findings

The customized CNN achieved a precision of 74.8% on the Real vs. Fake dataset and 76.2% on the DFDC dataset. White-box adversarial attacks significantly reduced the model's accuracy, highlighting the challenge of adversarial robustness. The study also demonstrated the model's applicability to other domains, such as a COVID-19 image dataset, even with limited training data.

Approach

The researchers trained a customized CNN and a GAN for deepfake detection using the DFDC and COVID-19 datasets. They pre-processed the data, resized the images, and split the datasets into training, validation, and test sets. Model performance was evaluated using precision, accuracy, and cross-entropy loss.

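A minimal PyTorch sketch of that pipeline follows. The summary does not report the input resolution, split ratios, directory layout, CNN architecture, or optimizer settings, so all of those are assumptions chosen for illustration.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

# Resize frames to a fixed size and convert to tensors; 224x224 is assumed.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Hypothetical directory of 'real'/'fake' class folders.
data = datasets.ImageFolder("dfdc_frames", transform=preprocess)

# Assumed 70/15/15 train/validation/test split.
n = len(data)
n_train, n_val = int(0.7 * n), int(0.15 * n)
train_set, val_set, test_set = random_split(
    data, [n_train, n_val, n - n_train - n_val])

train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

# Stand-in for the paper's customized CNN (its architecture is not given).
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(32 * 56 * 56, 2),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()  # cross-entropy loss, as in the summary

# One illustrative training epoch.
for images, labels in train_loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```
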
Datasets

Deepfake Detection Challenge (DFDC), COVID-19 image dataset, Real vs. Fake dataset (140,000 images)

Model(s)

Convolutional Neural Network (CNN), Generative Adversarial Network (GAN), ResNet50

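ResNet50 commonly enters deepfake-detection pipelines as a pretrained backbone fine-tuned for binary real-vs-fake classification. The torchvision sketch below shows one plausible setup under that assumption, not the paper's exact configuration.

```python
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained ResNet50 and replace its classification
# head with a two-class (real vs. fake) output layer.
resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
resnet.fc = nn.Linear(resnet.fc.in_features, 2)

# Optionally freeze the backbone and fine-tune only the new head,
# a common choice when labeled data is limited.
for name, param in resnet.named_parameters():
    if not name.startswith("fc"):
        param.requires_grad = False
```
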
Author countries

UK