Unmasking Illusions: Understanding Human Perception of Audiovisual Deepfakes

Authors: Ammarah Hashmi, Sahibzada Adil Shahzad, Chia-Wen Lin, Yu Tsao, Hsin-Min Wang

Published: 2024-05-07 07:57:15+00:00

AI Summary

This paper investigates human perception of audiovisual deepfakes by conducting a subjective study with 110 participants who assessed the authenticity of 40 videos. The study then compares human performance against five state-of-the-art AI deepfake detection models. Findings indicate that all AI models significantly outperform human observers, who tend to overestimate their own detection abilities despite performing only marginally better than random chance.

Abstract

The emergence of contemporary deepfakes has attracted significant attention in machine learning research, as artificial intelligence (AI) generated synthetic media increases the incidence of misinterpretation and is difficult to distinguish from genuine content. Currently, machine learning techniques have been extensively studied for automatically detecting deepfakes. However, human perception has been less explored. Malicious deepfakes could ultimately cause public and social problems. Can we humans correctly perceive the authenticity of the content of the videos we watch? The answer is obviously uncertain; therefore, this paper aims to evaluate the human ability to discern deepfake videos through a subjective study. We present our findings by comparing human observers to five state-of-the-art audiovisual deepfake detection models. To this end, we used gamification concepts to provide 110 participants (55 native English speakers and 55 non-native English speakers) with a web-based platform where they could access a series of 40 videos (20 real and 20 fake) to determine their authenticity. Each participant performed the experiment twice with the same 40 videos in different random orders. The videos are manually selected from the FakeAVCeleb dataset. We found that all AI models performed better than humans when evaluated on the same 40 videos. The study also reveals that while deception is not impossible, humans tend to overestimate their detection capabilities. Our experimental results may help benchmark human versus machine performance, advance forensics analysis, and enable adaptive countermeasures.


Key findings
All five state-of-the-art AI models performed significantly better than human observers in detecting audiovisual deepfakes. Humans achieved an average accuracy of 65.64%, only marginally above chance, and exhibited overconfidence in their detection capabilities. Factors such as age, gender, and native language influenced human performance, but forewarning about deepfakes and self-reported IT skill level did not significantly improve accuracy.
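To make "marginally above chance" concrete: with 20 real and 20 fake videos, a guesser is right about 50% of the time, and an exact binomial tail probability shows how unlikely a given accuracy is under pure guessing. The sketch below is illustrative only; the paper reports a 65.64% average, so the `n_correct = 26` figure (26/40 ≈ 65%) is an assumed round number, not data from the study.

```python
from math import comb

def binom_tail(n: int, k: int, p: float = 0.5) -> float:
    """Exact one-sided tail P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Illustrative numbers (not from the paper's raw data): one observer
# labelling 26 of the 40 videos correctly, i.e. ~65% accuracy, close
# to the reported 65.64% average.
n_videos = 40
n_correct = 26
p_value = binom_tail(n_videos, n_correct)  # chance of doing this well by guessing
```

For a single 40-video session this tail probability is small but not vanishing, which matches the paper's framing: human accuracy sits only modestly above the 50% guessing baseline, while the AI models clear it by a much wider margin.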
Approach
The authors conducted a subjective study using a web-based platform, where 110 participants evaluated 40 manually selected videos (20 real, 20 fake) from the FakeAVCeleb dataset for authenticity. Human detection accuracy and confidence were analyzed across various demographic factors and compared against the performance of five pre-trained state-of-the-art AI deepfake detection models on the same video set.
Datasets
FakeAVCeleb
Model(s)
LipForensics, AV-Lip-Sync, AV-Lip-Sync+, CNN-Ensemble, AVTENet
Author countries
Taiwan