Qualitative Failures of Image Generation Models and Their Application in Detecting Deepfakes

Authors: Ali Borji

Published: 2023-03-29 15:26:44+00:00

AI Summary

This research paper identifies qualitative shortcomings in image generation models, classifying them into five categories (human/animal body parts, geometry, physics, semantics/logic, text/noise/details). Understanding these failures helps improve model development and creates strategies for detecting deepfakes.

Abstract

The ability of image and video generation models to create photorealistic images has reached unprecedented heights, making it difficult to distinguish between real and fake images in many cases. However, despite this progress, a gap remains between the quality of generated images and those found in the real world. To address this, we have reviewed a vast body of literature from both academic publications and social media to identify qualitative shortcomings in image generation models, which we have classified into five categories. By understanding these failures, we can identify areas where these models need improvement, as well as develop strategies for detecting deep fakes. The prevalence of deep fakes in today's society is a serious concern, and our findings can help mitigate their negative impact.

Key findings

The paper outlines five categories of qualitative failures in image generation models, offering a checklist for evaluating such models and detecting deepfakes. These failures range from issues with human anatomy and physics to problems with semantics and text rendering. The author suggests that a combination of these indicators may be needed for effective deepfake detection.
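
As a rough illustration of how such a checklist might be combined into a single decision, the sketch below records which of the five categories a human reviewer has flagged for an image and calls the image suspicious once enough categories are flagged. The category names come from the summary above; the `FailureChecklist` class, its methods, and the threshold of two categories are hypothetical choices made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass, field

# Category names follow the summary above; everything else here is hypothetical.
CATEGORIES = (
    "human/animal body parts",
    "geometry",
    "physics",
    "semantics/logic",
    "text/noise/details",
)


@dataclass
class FailureChecklist:
    """Tracks which failure categories a reviewer has flagged for one image."""

    flags: dict = field(default_factory=lambda: {c: False for c in CATEGORIES})

    def mark(self, category: str) -> None:
        """Flag a category; reject names outside the five listed above."""
        if category not in self.flags:
            raise ValueError(f"unknown category: {category!r}")
        self.flags[category] = True

    def flagged(self) -> list:
        """Return flagged categories in the fixed order of CATEGORIES."""
        return [c for c in CATEGORIES if self.flags[c]]

    def is_suspicious(self, min_categories: int = 2) -> bool:
        """Combine indicators: suspicious if enough categories are flagged.

        The default of 2 is an arbitrary assumption, not a value from the paper.
        """
        return len(self.flagged()) >= min_categories


if __name__ == "__main__":
    checklist = FailureChecklist()
    checklist.mark("geometry")
    checklist.mark("text/noise/details")
    print(checklist.flagged(), checklist.is_suspicious())
```

Requiring more than one flagged category reflects the point that no single indicator is likely to be reliable on its own; a real detector would need to tune or replace this rule.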

Approach

The author reviewed academic publications and social media to identify qualitative failures in image generation models. These failures were categorized to understand model limitations and inform deepfake detection strategies. A collection of examples illustrating these failures was also made available.

Datasets

DiffusionDB, images from Twitter, LinkedIn, Discord, Reddit, thisxdoesnotexist.com, whichfaceisreal.com, Adobe Stock library, and openart.ai

Model(s)

UNKNOWN

Author countries

UNKNOWN
