MesoNet: a Compact Facial Video Forgery Detection Network

Authors: Darius Afchar, Vincent Nozick, Junichi Yamagishi, Isao Echizen

Published: 2018-09-04 10:59:22+00:00

AI Summary

This paper proposes MesoNet, a compact deep learning network for detecting face tampering in videos generated with the Deepfake and Face2Face techniques. By analyzing mesoscopic image properties, it achieves detection rates of over 98% for Deepfake and 95% for Face2Face, in a setting where traditional image forensics methods struggle because video compression degrades the data.

Abstract

This paper presents a method to automatically and efficiently detect face tampering in videos, and particularly focuses on two recent techniques used to generate hyper-realistic forged videos: Deepfake and Face2Face. Traditional image forensics techniques are usually not well suited to videos due to the compression that strongly degrades the data. Thus, this paper follows a deep learning approach and presents two networks, both with a low number of layers to focus on the mesoscopic properties of images. We evaluate those fast networks on both an existing dataset and a dataset we have constituted from online videos. The tests demonstrate a very successful detection rate with more than 98% for Deepfake and 95% for Face2Face.


Key findings
MesoNet achieves high detection rates, exceeding 98% for Deepfake and 95% for Face2Face videos. Aggregating frame-level classifications further improves performance, even under real-world video compression conditions. The analysis of network activations suggests that the model focuses on detail differences between real and forged faces, particularly in eye and mouth regions.
Approach
MesoNet uses two lightweight deep neural network architectures (Meso-4 and MesoInception-4) to analyze mesoscopic image properties. These networks efficiently detect forged videos by analyzing intermediate-level features, avoiding the limitations of microscopic and macroscopic analyses on compressed video data. Frame-level classifications are aggregated for improved accuracy.
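The frame-level aggregation mentioned above can be illustrated with a minimal sketch. It assumes that each frame has already been scored by one of the networks with a value in [0, 1] (higher meaning more likely forged), and that aggregation is a plain average followed by a 0.5 threshold. The function names, the averaging rule, the threshold, and the example scores are illustrative assumptions, not details taken from this summary.

```python
from statistics import mean
from typing import Sequence


def aggregate_video_score(frame_scores: Sequence[float]) -> float:
    """Average per-frame forgery scores into one video-level score.

    Assumption: simple mean aggregation; other rules (median, voting)
    would fit the same interface.
    """
    if not frame_scores:
        raise ValueError("need at least one frame score")
    return mean(frame_scores)


def classify_video(frame_scores: Sequence[float], threshold: float = 0.5) -> str:
    """Label a video 'forged' when its averaged score reaches the threshold."""
    score = aggregate_video_score(frame_scores)
    return "forged" if score >= threshold else "real"


# Hypothetical per-frame scores; one frame (0.3) is misclassified on its own,
# but the averaged score still flags the video as forged.
scores = [0.9, 0.8, 0.3, 0.7]
label = classify_video(scores)
print(label)
```

Averaging lets a few misclassified frames be outvoted by the rest of the video, which is one plausible reason aggregation helps on compressed footage.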
Datasets
Two datasets: a custom Deepfake dataset built from online videos, and the existing FaceForensics dataset, from which the Face2Face forged videos are taken.
Model(s)
Meso-4 and MesoInception-4 networks.
Author countries
France, Japan