Study of detecting behavioral signatures within DeepFake videos
Authors: Qiaomu Miao, Sinhwa Kang, Stacy Marsella, Steve DiPaola, Chao Wang, Ari Shapiro
Published: 2022-08-06 18:30:53+00:00
Comment: 9 pages
AI Summary
This paper investigates whether human behavioral signatures, specifically movements, can be distinguished from a person's visual appearance in synthetic videos as a means of detecting deepfakes. The authors conduct a user study comparing synthetic videos generated by transferring behavior signals from different sources (different person/different utterance, same person/different utterance, different person/same utterance) onto a target speaker's appearance. Their findings indicate that synthetic videos in all cases are perceived as less real and less engaging than original videos, suggesting a detectable behavioral signature separate from visual appearance.
Abstract
There is strong interest in the generation of synthetic video imagery of people talking for various purposes, including entertainment, communication, training, and advertisement. With the development of deep fake generation models, synthetic video imagery will soon be visually indistinguishable to the naked eye from naturally captured video. In addition, many methods continue to improve in ways that evade more careful forensic visual analysis. Some deep fake videos are produced through the use of facial puppetry, which directly controls the head and face of the synthetic image through the movements of an actor, allowing the actor to 'puppet' the image of another person. In this paper, we address the question of whether one person's movements can be distinguished from the original speaker's by controlling the visual appearance of the speaker while transferring the behavior signals from another source. We conduct a study comparing synthetic imagery that: 1) originates from a different person speaking a different utterance, 2) originates from the same person speaking a different utterance, and 3) originates from a different person speaking the same utterance. Our study shows that synthetic videos in all three cases are seen as less real and less engaging than the original source video. Our results indicate that there may be a behavioral signature, detectable from a person's movements and separate from their visual appearance, and that this behavioral signature could be used to distinguish a deep fake from a properly captured video.