A Representative Study on Human Detection of Artificially Generated Media Across Countries

Authors: Joel Frank, Franziska Herbert, Jonas Ricker, Lea Schönherr, Thorsten Eisenhofer, Asja Fischer, Markus Dürmuth, Thorsten Holz

Published: 2023-12-10 19:34:52+00:00

AI Summary

This paper presents the first large-scale cross-country and cross-media study investigating human ability to detect AI-generated media (audio, image, and text). The results show that state-of-the-art forgeries are nearly indistinguishable from real media, with participants' accuracy often below chance.

Abstract

AI-generated media has become a threat to our digital society as we know it. These forgeries can be created automatically and on a large scale based on publicly available technology. Recognizing this challenge, academics and practitioners have proposed a multitude of automatic detection strategies to detect such artificial media. However, in contrast to these technical advances, the human perception of generated media has not been thoroughly studied yet. In this paper, we aim to close this research gap. We perform the first comprehensive survey into people's ability to detect generated media, spanning three countries (USA, Germany, and China) with 3,002 participants across audio, image, and text media. Our results indicate that state-of-the-art forgeries are almost indistinguishable from real media, with the majority of participants simply guessing when asked to rate them as human- or machine-generated. In addition, AI-generated media were rated as more human-like across all media types and all countries. To further understand which factors influence people's ability to detect generated media, we include personal variables, chosen based on a literature review in the domains of deepfake and fake news research. In a regression analysis, we found that generalized trust, cognitive reflection, and self-reported familiarity with deepfakes significantly influence participants' decisions across all media categories.


Key findings
Participants struggled to distinguish AI-generated media from real media across all modalities, often performing below chance. Demographic factors showed only marginal influence on detection accuracy, primarily for German participants with audio. Generalized trust and cognitive reflection significantly influenced participants' decisions.
Approach
The researchers conducted a large-scale online survey across the USA, Germany, and China, presenting participants with audio, image, and text samples (real and AI-generated). Participants rated the believability of each sample, and the study analyzed the accuracy of these ratings and the influence of demographic and cognitive factors.
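The core comparison described above, checking whether participants' binary real/fake judgments beat a 50% chance baseline, can be sketched as below. This is an illustrative reconstruction, not the paper's analysis code; the ratings are made-up example data, and the exact statistical tests used in the study may differ.

```python
from math import comb

def binom_test_two_sided(k, n, p=0.5):
    """Exact two-sided binomial test: total probability of outcomes
    at least as unlikely as observing k successes in n trials under p."""
    probs = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    observed = probs[k]
    # Sum every outcome whose probability is <= the observed one
    # (small epsilon guards against floating-point ties).
    return min(1.0, sum(pr for pr in probs if pr <= observed + 1e-12))

# Hypothetical per-item ratings for one participant:
# 1 = sample correctly classified as real/generated, 0 = misclassified.
ratings = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
k, n = sum(ratings), len(ratings)
accuracy = k / n                      # 0.30, i.e. below the 0.5 chance level
p_value = binom_test_two_sided(k, n)  # does accuracy differ from chance?
print(f"accuracy={accuracy:.2f}, p={p_value:.3f}")
```

In the study itself, such accuracy measures were further related to personal variables (e.g., generalized trust, cognitive reflection) via regression analysis.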
Datasets
LJSpeech (English audio), CSMSC (Chinese audio), HUI (German audio), a subset of Nightingale and Farid's dataset (images), and news articles from NPR (USA), Tagesschau (Germany), and CCTV (China) (text).
Model(s)
Tacotron 2 and HiFi-GAN for audio generation, StyleGAN2 for image generation, and OpenAI's GPT-3 (Davinci) for text generation. No specific models were used for *detection*; the study focused on human detection ability.
Author countries
Germany, USA