ACM Multimedia Grand Challenge on Detecting Cheapfakes

Authors: Shivangi Aneja, Cise Midoglu, Duc-Tien Dang-Nguyen, Sohail Ahmed Khan, Michael Riegler, Pål Halvorsen, Chris Bregler, Balu Adsumilli

Published: 2022-07-29 08:02:42+00:00

Comment: arXiv admin note: substantial text overlap with arXiv:2107.05297

AI Summary

This paper introduces the ACM Multimedia Grand Challenge on Detecting Cheapfakes, which focuses on the detection of out-of-context (OOC) misuse of real images in news items. The challenge aims to develop and benchmark models capable of identifying OOC images by analyzing inconsistencies between news images and their associated captions. Participants are tasked with detecting conflicting image-caption triplets and identifying fake captions based on the COSMOS dataset.

Abstract

Cheapfake is a recently coined term that encompasses non-AI ("cheap") manipulations of multimedia content. Cheapfakes are known to be more prevalent than deepfakes. Cheapfake media can be created using editing software for image/video manipulations, or even without using any software, by simply altering the context of an image/video by sharing the media alongside misleading claims. This alteration of context is referred to as out-of-context (OOC) misuse of media. OOC media is much harder to detect than fake media, since the images and videos have not been tampered with. In this challenge, we focus on detecting OOC images, and more specifically the misuse of real photographs with conflicting image captions in news items. The aim of this challenge is to develop and benchmark models that can be used to detect whether given samples (news image and associated captions) are OOC, based on the recently compiled COSMOS dataset.


Key findings

As a challenge description, the paper does not present experimental findings but outlines its goals and evaluation criteria. It aims to motivate researchers to develop novel methods for detecting out-of-context image misuse, a relatively unexplored area compared to deepfakes. The challenge will benchmark proposed models based on effectiveness (accuracy, precision, recall, F1-score, MCC) and efficiency (latency, number of parameters, model size), encouraging the understanding and generation of supervised datasets for this problem.
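
The effectiveness metrics listed above can all be derived from a binary confusion matrix. The following is a minimal sketch, assuming labels are encoded as 1 for OOC and 0 for NOOC; the function name `binary_metrics` and the label encoding are illustrative assumptions, not part of the challenge's official evaluation code.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, F1 and MCC for binary labels.

    Assumption: 1 encodes OOC (positive class), 0 encodes NOOC.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(y_true, y_pred))
```

MCC is reported alongside accuracy and F1 because it uses all four confusion-matrix cells, which makes it less sensitive to class imbalance than accuracy alone.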

Approach

This paper describes a grand challenge rather than proposing a specific detection model. The challenge defines two tasks for participants: 1) identifying conflicting image-caption triplets (classifying as OOC or NOOC), and 2) detecting fake captions when only a single image-caption pair is given. The challenge provides the COSMOS dataset, which includes images, multiple captions, and metadata like modified captions (using Spacy NER) and object bounding boxes (from Detectron2) to facilitate model development.
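
To make the input and output of the two tasks concrete, here is a minimal sketch of how the samples might be represented. The class names, field names, and the token-overlap heuristic are hypothetical placeholders chosen for illustration; they do not reflect the actual COSMOS file schema or any method proposed in the paper, and a real entry would also need to use the image.

```python
from dataclasses import dataclass


@dataclass
class Triplet:
    """Task 1 input: one image with two captions (hypothetical field names)."""
    image_path: str
    caption1: str
    caption2: str


@dataclass
class Pair:
    """Task 2 input: a single image-caption pair (hypothetical field names)."""
    image_path: str
    caption: str


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the lowercase whitespace tokens of two strings."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def predict_task1(sample: Triplet, threshold: float = 0.2) -> str:
    """Toy placeholder classifier, NOT the challenge's method.

    Flags a triplet as OOC when the two captions share few tokens.
    It ignores the image entirely and only illustrates the expected
    output labels ("OOC" or "NOOC").
    """
    similar = token_jaccard(sample.caption1, sample.caption2) >= threshold
    return "NOOC" if similar else "OOC"


if __name__ == "__main__":
    # Invented captions, for illustration only.
    sample = Triplet(
        image_path="img.jpg",
        caption1="Protesters march in Paris",
        caption2="A concert crowd at a stadium",
    )
    print(predict_task1(sample))
```

Task 2 would take a `Pair` instead and return a genuine/fake decision; no predictor is sketched for it here because the source does not describe how such a decision is made.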

Datasets

COSMOS dataset

Model(s)

UNKNOWN

Author countries

Germany, Norway, USA