OpenForensics: Large-Scale Challenging Dataset For Multi-Face Forgery Detection And Segmentation In-The-Wild

Authors: Trung-Nghia Le, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen

Published: 2021-07-30 08:15:41+00:00

Comment: Accepted to ICCV 2021. Project page: https://sites.google.com/view/ltnghia/research/openforensics

AI Summary

This paper introduces OpenForensics, the first large-scale, challenging dataset for multi-face forgery detection and segmentation in-the-wild. It aims to promote these new tasks, which involve localizing forged faces among multiple human faces in unrestricted natural scenes. The authors also establish a suite of benchmarks by evaluating state-of-the-art instance detection and segmentation methods on their newly created dataset.

Abstract

The proliferation of deepfake media is raising concerns among the public and relevant authorities. It has become essential to develop countermeasures against forged faces in social media. This paper presents a comprehensive study on two new countermeasure tasks: multi-face forgery detection and segmentation in-the-wild. Localizing forged faces among multiple human faces in unrestricted natural scenes is far more challenging than the traditional deepfake recognition task. To promote these new tasks, we have created the first large-scale dataset posing a high level of challenges that is designed with face-wise rich annotations explicitly for face forgery detection and segmentation, namely OpenForensics. With its rich annotations, our OpenForensics dataset has great potential for research in both deepfake prevention and general human face detection. We have also developed a suite of benchmarks for these tasks by conducting an extensive evaluation of state-of-the-art instance detection and segmentation methods on our newly constructed dataset in various scenarios. The dataset, benchmark results, code, and supplementary materials will be publicly available on our project page: https://sites.google.com/view/ltnghia/research/openforensics


Key findings

The OpenForensics dataset poses significant challenges, with human accuracy in detecting forged faces dropping sharply as the number of manipulated faces increases. Benchmark results showed a substantial drop in performance for all state-of-the-art detection and segmentation methods on the test-challenge set compared to the standard test-development set, indicating weak robustness to unseen, real-world scenarios. This highlights that multi-face forgery detection and segmentation in-the-wild are far from being solved.

Approach

The authors created the OpenForensics dataset by collecting real human images, synthesizing forged faces using GAN models and an iterative spoofing process, and applying extensive multi-task annotations. They then benchmarked state-of-the-art instance detection and segmentation models on this dataset to evaluate performance on multi-face forgery detection and segmentation tasks.
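
To make the multi-face detection task concrete, the sketch below shows one common way to score face-wise predictions in a single image: each predicted box is matched to an unused ground-truth face of the same class by intersection-over-union (IoU). This is an illustration only, not the paper's evaluation protocol. The "real"/"fake" label strings, the 0.5 IoU threshold, the sample boxes, and the function names `box_iou` and `match_faces` are all assumptions introduced here.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_faces(
    preds: List[Tuple[Box, str, float]],  # (box, label, confidence score)
    gts: List[Tuple[Box, str]],           # (box, label)
    iou_thresh: float = 0.5,              # assumed threshold, not from the paper
) -> Tuple[int, int, int]:
    """Greedily match predictions to ground truth; return (TP, FP, FN)."""
    used = set()
    tp = fp = 0
    # Visit predictions from most to least confident.
    for box, label, _score in sorted(preds, key=lambda p: p[2], reverse=True):
        best_iou, best_j = 0.0, None
        for j, (gt_box, gt_label) in enumerate(gts):
            if j in used or gt_label != label:
                continue
            iou = box_iou(box, gt_box)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j is not None and best_iou >= iou_thresh:
            used.add(best_j)
            tp += 1
        else:
            fp += 1
    fn = len(gts) - len(used)
    return tp, fp, fn


# Made-up example: one forged and one real face in the same image.
gts = [((10, 10, 50, 50), "fake"), ((60, 10, 100, 50), "real")]
preds = [((12, 11, 49, 52), "fake", 0.9), ((61, 9, 99, 51), "fake", 0.6)]
print(match_faces(preds, gts))  # (1, 1, 1)
```

In this made-up example the second prediction localizes the real face well but labels it as forged, so it counts as a false positive and the real face goes unmatched. That is the kind of error that only appears when several faces in one image must be classified individually.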

Datasets

OpenForensics, Google Open Images

Model(s)

UNKNOWN

Author countries

Japan