Detecting GAN-generated Imagery using Color Cues

Authors: Scott McCloskey, Michael Albright

Published: 2018-12-19 21:12:00+00:00

AI Summary

This paper proposes two methods for detecting GAN-generated images by analyzing color cues. The first method leverages the frequency of saturated pixels, while the second uses color channel correlations. Both methods are based on an analysis of the GAN generator's architecture and its handling of color information.

Abstract

Image forensics is an increasingly relevant problem, as it can potentially address online disinformation campaigns and mitigate problematic aspects of social media. Of particular interest, given its recent successes, is the detection of imagery produced by Generative Adversarial Networks (GANs), e.g., "deepfakes". Leveraging large training sets and extensive computing resources, recent work has shown that GANs can be trained to generate synthetic imagery which is (in some ways) indistinguishable from real imagery. We analyze the structure of the generating network of a popular GAN implementation, and show that the network's treatment of color is markedly different from a real camera in two ways. We further show that these two cues can be used to distinguish GAN-generated imagery from camera imagery, demonstrating effective discrimination between GAN imagery and real camera images used to train the GAN.


Key findings
The saturation-based method discriminated well between GAN-generated and real images, especially on the fully GAN-generated GAN Crop images (AUC of 0.7). The color-correlation method, however, performed poorly (AUC around 0.5, i.e., near chance), likely because of limitations in the training data and in the pre-trained model used.
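For reference, AUC is computed from a standard ROC analysis of detector scores, where 0.5 corresponds to chance and 1.0 to perfect separation. The snippet below is a generic scikit-learn illustration of how such scores would be evaluated, not the authors' evaluation code.

```python
from sklearn.metrics import roc_auc_score

# Hypothetical example: labels are 1 for GAN-generated, 0 for camera images;
# scores are the detector's confidence that each image is GAN-generated.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.2, 0.7, 0.55]

auc = roc_auc_score(labels, scores)
print(f"AUC = {auc:.2f}")  # 0.5 is chance performance, 1.0 is perfect separation
```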
Approach
The authors analyze the GAN generator's architecture to identify two color-related cues: the frequency of saturated pixels and the correlation between color channels. They then build a separate detector for each cue: a Support Vector Machine operating on saturated-pixel frequencies, and a pre-trained Intensity Noise Histogram (INH) network for color-channel correlation (see the sketch below).
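A minimal sketch of the saturation-cue detector is given below. It assumes the features are simply the per-channel fractions of saturated (255) and under-exposed (0) pixels, fed to an SVM; the paper's exact feature set and SVM configuration may differ.

```python
import numpy as np
from sklearn.svm import SVC

def saturation_features(image):
    """Per-channel fractions of saturated (255) and under-exposed (0) pixels.

    `image` is an HxWx3 uint8 RGB array. These six frequencies are an assumed
    feature set for illustration; the paper's exact measurements may differ.
    """
    feats = []
    for c in range(3):
        channel = image[:, :, c]
        n = channel.size
        feats.append(np.count_nonzero(channel == 255) / n)  # saturated pixels
        feats.append(np.count_nonzero(channel == 0) / n)    # under-exposed pixels
    return np.array(feats)

def train_detector(images, labels):
    """Fit an SVM on labeled examples (1 = GAN-generated, 0 = camera image)."""
    X = np.stack([saturation_features(img) for img in images])
    clf = SVC(kernel="rbf", probability=True)  # kernel choice is an assumption
    clf.fit(X, labels)
    return clf
```

At test time, `clf.predict_proba(saturation_features(img)[None, :])` would give a GAN-likelihood score for a new image, which can then be thresholded or scored with an ROC analysis as above.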
Datasets
Two benchmark datasets from the US National Institute of Standards and Technology (NIST) Media Forensics Challenge 2018: GAN Crop and GAN Full. GAN Crop images are small cropped regions that are either entirely GAN-generated or entirely from a camera image, whereas GAN Full images are mostly camera images in which some real faces have been replaced by GAN-generated faces.
Model(s)
Support Vector Machine (SVM) and a pre-trained Intensity Noise Histogram (INH) network.
Author countries
UNKNOWN