The MeVer DeepFake Detection Service: Lessons Learnt from Developing and Deploying in the Wild

Authors: Spyridon Baxevanakis, Giorgos Kordopatis-Zilos, Panagiotis Galopoulos, Lazaros Apostolidis, Killian Levacher, Ipek B. Schlicht, Denis Teyssou, Ioannis Kompatsiaris, Symeon Papadopoulos

Published: 2022-04-27 10:20:44+00:00

AI Summary

This paper introduces the MeVer DeepFake detection service, a web service that uses a model ensemble to detect deep learning manipulations in images and videos. The service is designed for robustness and transparency: it is documented with a model card and evaluated against adversarial attacks.

Abstract

Enabled by recent improvements in generation methodologies, DeepFakes have become mainstream due to their steadily improving visual quality, the growing number of easy-to-use generation tools, and their rapid dissemination through social media. This poses a severe threat to our societies, with the potential to erode social cohesion and influence our democracies. To mitigate the threat, numerous DeepFake detection schemes have been introduced in the literature, but very few provide a web service that can be used in the wild. In this paper, we introduce the MeVer DeepFake detection service, a web service for detecting deep learning manipulations in images and video. We present the design and implementation of the proposed processing pipeline, which involves a model ensemble scheme, and we endow the service with a model card for transparency. Experimental results show that our service performs robustly on three benchmark datasets while remaining vulnerable to adversarial attacks. Finally, we outline our experience and the lessons learned from deploying a research system into production, in the hope that they will be useful to other academic and industry teams.


Key findings
The MeVer service shows robust performance on CelebDF and WildDeepFake datasets, achieving high balanced accuracy. However, performance is lower on FaceForensics++, particularly for Expression Swap manipulations. The service is vulnerable to adversarial attacks, highlighting a need for further research in robustness.
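For readers unfamiliar with the metric named above, balanced accuracy is the mean of the true-positive rate and the true-negative rate, so it is not inflated when real and fake samples are unevenly represented. The sketch below is a generic illustration of that definition and is not taken from the service's evaluation code; the function name and the label convention (1 = fake, 0 = real) are assumptions.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of recall on the fake class (1) and recall on the real class (0).

    Assumed label convention: 1 = fake, 0 = real. Both classes must be
    present in y_true, otherwise a per-class recall is undefined.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in y_true")

    true_positive_rate = tp / (tp + fn)
    true_negative_rate = tn / (tn + fp)
    return (true_positive_rate + true_negative_rate) / 2


# A detector that labels everything as fake on an imbalanced set:
# plain accuracy would be 4/5, but balanced accuracy exposes the failure.
labels = [1, 1, 1, 1, 0]
always_fake = [1, 1, 1, 1, 1]
score = balanced_accuracy(labels, always_fake)
```

In this toy case the detector gets every fake right and every real sample wrong, so the two per-class recalls are 1 and 0.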
Approach
The service uses a multi-stage pipeline: it downloads and preprocesses media, detects and clusters faces, and then uses an ensemble of five EfficientNet and DETR-based models for inference, aggregating predictions for a final DeepFake probability score.
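The final aggregation step can be illustrated with a small sketch. The summary does not state how predictions are combined, so the rules used here are assumptions chosen only for illustration: average the ensemble models' probabilities per face, average the faces within each cluster (one cluster per person), and report the highest cluster score for the whole video. The function name and the input layout are hypothetical.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical layout: cluster id -> list of faces, where each face is a
# list holding one fake-probability per ensemble model.
ClusterScores = Dict[str, List[List[float]]]


def aggregate_video_score(cluster_scores: ClusterScores) -> float:
    """Combine per-model, per-face probabilities into one video-level score.

    Assumed rules (not confirmed by the source):
      1. mean over models for each face,
      2. mean over faces for each cluster,
      3. max over clusters for the video.
    Each cluster is expected to contain at least one face.
    """
    if not cluster_scores:
        raise ValueError("no face clusters to aggregate")

    per_cluster = []
    for faces in cluster_scores.values():
        face_scores = [mean(model_probs) for model_probs in faces]
        per_cluster.append(mean(face_scores))

    # One manipulated identity is enough to flag the video.
    return max(per_cluster)


example = {
    "person_a": [[0.9, 0.7], [0.8, 0.8]],  # two faces, two models each
    "person_b": [[0.1, 0.3]],              # one face, two models
}
video_score = aggregate_video_score(example)
```

Taking the maximum over clusters reflects the idea that a video with several people should be flagged if any one of them appears manipulated; a deployment could equally report per-cluster scores instead.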
Datasets
FaceForensics++, CelebDF-V2, WildDeepFake
Model(s)
Ensemble of five models: EfficientNet-b4 (3 variations with DETR transformer head), EfficientNet-V2-m
Author countries
Greece, Ireland, Germany, France