Deepfake detection: humans vs. machines

Authors: Pavel Korshunov, Sébastien Marcel

Published: 2020-09-07 15:20:37+00:00

AI Summary

This paper presents a subjective study comparing human and machine performance in deepfake video detection. Human subjects evaluated 120 videos (60 deepfakes, 60 originals) from the Facebook Deepfake database, while two state-of-the-art deepfake detection algorithms (based on Xception and EfficientNet) were also tested on the same videos. The results reveal significant differences in how humans and machines perceive deepfakes.

Abstract

Deepfake videos, in which a person's face is automatically swapped with the face of someone else, are becoming easier to generate with increasingly realistic results. In response to the threat such manipulations pose to our trust in video evidence, several large datasets of deepfake videos and many methods for detecting them have been proposed recently. However, it is still unclear how realistic deepfake videos are for an average person and whether detection algorithms are significantly better than humans at spotting them. In this paper, we present a subjective study conducted in a crowdsourcing-like scenario, which systematically evaluates how hard it is for humans to tell whether a video is a deepfake or not. For the evaluation, we used 120 different videos (60 deepfakes and 60 originals) manually pre-selected from the Facebook deepfake database, which was provided in the Kaggle Deepfake Detection Challenge 2020. For each video, a simple question, "Is the face of the person in the video real or fake?", was answered on average by 19 naive subjects. The results of the subjective evaluation were compared with the performance of two different state-of-the-art deepfake detection methods, based on the Xception and EfficientNet (B4 variant) neural networks, which were pre-trained on two other large public databases: the Google subset of FaceForensics++ and the recent Celeb-DF dataset. The evaluation demonstrates that while human perception differs greatly from machine perception, both are successfully fooled by deepfakes, albeit in different ways. In particular, the algorithms struggle to detect deepfake videos that human subjects found very easy to spot.
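
To make the aggregation of the subjective answers concrete, the following minimal Python sketch computes, for each video, the fraction of naive subjects who answered the real-or-fake question correctly. The file name, column labels (answers.csv, video_id, is_fake, subject_answer) and the difficulty bins are hypothetical placeholders, not details taken from the paper.

import pandas as pd

# Hedged sketch: aggregate crowdsourced answers into a per-video "easiness" score.
answers = pd.read_csv("answers.csv")  # one row per (subject, video) pair

# A subject is correct when their real/fake answer matches the ground truth.
answers["correct"] = answers["subject_answer"] == answers["is_fake"]

# Fraction of the ~19 naive subjects per video who answered correctly.
per_video = answers.groupby("video_id")["correct"].mean().rename("human_accuracy")

# Videos can then be binned by how hard they were for humans to judge;
# the bin edges below are illustrative, not the paper's categories.
difficulty = pd.cut(per_video, bins=[0.0, 0.4, 0.7, 1.0],
                    labels=["hard", "moderate", "easy"], include_lowest=True)
print(pd.concat([per_video, difficulty.rename("difficulty")], axis=1).head())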


Key findings
Humans and algorithms exhibit significantly different detection patterns: the algorithms struggle with deepfakes that humans detect easily. Overall, human subjects are more accurate at detecting deepfakes, although good-quality deepfakes can still fool a majority of people.
Approach
The study uses a crowdsourcing-like approach to evaluate human perception of deepfakes, categorizing the videos by how difficult they are for humans to detect. Two pre-trained deepfake detection models (Xception and EfficientNet-B4) are then evaluated on the same videos, and their performance is compared against the human subject results, as sketched below.
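
A minimal sketch of the comparison step is given below, assuming per-video detector scores and per-video human difficulty labels are available as CSV files; the file names, column labels, and the 0.5 decision threshold are illustrative assumptions rather than the paper's exact protocol.

import pandas as pd

# model_scores: hypothetical per-video deepfake probabilities from Xception or
# EfficientNet-B4; human_eval: per-video ground truth and human difficulty labels.
model_scores = pd.read_csv("model_scores.csv")   # columns: video_id, score
human_eval = pd.read_csv("human_eval.csv")       # columns: video_id, is_fake, difficulty

merged = model_scores.merge(human_eval, on="video_id")
merged["model_says_fake"] = merged["score"] > 0.5  # assumed decision threshold
merged["model_correct"] = merged["model_says_fake"] == merged["is_fake"]

# Detector accuracy broken down by how easy the videos were for human subjects.
print(merged.groupby("difficulty")["model_correct"].mean())
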
Datasets
Facebook Deepfake database (for subjective evaluation); Google's subset from FaceForensics++, Celeb-DF (for model pre-training)
Model(s)
Xception, EfficientNet (B4 variant)
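
As an architectural illustration only, the sketch below instantiates an EfficientNet-B4 binary classifier with the timm library; pretrained=True loads ImageNet weights, not the FaceForensics++ or Celeb-DF weights used in the paper, and the Xception backbone would be set up analogously.

import timm
import torch

# Hedged sketch: EfficientNet-B4 with a single-logit head for real-vs-fake scoring.
model = timm.create_model("efficientnet_b4", pretrained=True, num_classes=1)
model.eval()

# Score one hypothetical aligned face crop (EfficientNet-B4 commonly uses
# 380x380 inputs); a higher sigmoid output means "more likely fake" in this setup.
face = torch.rand(1, 3, 380, 380)  # placeholder tensor standing in for a real crop
with torch.no_grad():
    deepfake_score = torch.sigmoid(model(face)).item()
print(deepfake_score)
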
Author countries
Switzerland