DeepFake-O-Meter v2.0: An Open Platform for DeepFake Detection

Authors: Yan Ju, Chengzhe Sun, Shan Jia, Shuwei Hou, Zhaofeng Si, Soumyya Kanti Datta, Lipeng Ke, Riky Zhou, Anita Nikolich, Siwei Lyu

Published: 2024-04-19 19:24:20+00:00

AI Summary

DeepFake-O-Meter v2.0 is an open-source platform integrating state-of-the-art DeepFake detection algorithms for images, videos, and audio. It offers a user-friendly interface and improved architecture compared to its predecessor, serving both everyday users and researchers.

Abstract

Deepfakes, as AI-generated media, have increasingly threatened media integrity and personal privacy with realistic yet fake digital content. In this work, we introduce an open-source and user-friendly online platform, DeepFake-O-Meter v2.0, that integrates state-of-the-art methods for detecting Deepfake images, videos, and audio. Built upon DeepFake-O-Meter v1.0, we have made significant upgrades and improvements in platform architecture design, including user interaction, detector integration, job balancing, and security management. The platform aims to offer everyday users a convenient service for analyzing DeepFake media using multiple state-of-the-art detection algorithms. It ensures secure and private delivery of the analysis results. Furthermore, it serves as an evaluation and benchmarking platform for researchers in digital media forensics to compare the performance of multiple algorithms on the same input. We have also conducted detailed usage analysis based on the collected data to gain deeper insights into our platform's statistics. This involves analyzing two-month trends in user activity and evaluating the processing efficiency of each detector.


Key findings
User submissions have grown substantially since May 1, 2024. Image and video detectors are used far more often than audio detectors. Average processing time is roughly 30 seconds for image and audio detection, and approximately 90 seconds for video detection.
Approach
The platform integrates multiple existing open-source DeepFake detection models for images, videos, and audio. Each detector is packaged in a Docker container for isolated, efficient execution, and a job-balancing module manages incoming user requests.
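The containerized, job-balanced design described above can be sketched roughly as follows. This is a minimal illustration only: the class and function names, the round-robin policy, and the `docker run` invocation are assumptions for exposition, not the platform's actual code.

```python
from collections import deque
from itertools import cycle


def build_docker_cmd(detector_image: str, input_path: str) -> list[str]:
    """Build a hypothetical `docker run` command for one detector container.

    The input file is mounted read-only into the container; `--rm` cleans up
    the container after the job finishes.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{input_path}:/data/input:ro",
        detector_image,
    ]


class JobBalancer:
    """Distribute user submissions across worker slots round-robin (illustrative)."""

    def __init__(self, n_workers: int):
        self.queues = [deque() for _ in range(n_workers)]
        self._next = cycle(range(n_workers))

    def submit(self, job: str) -> int:
        """Enqueue a job on the next worker; return the worker index used."""
        idx = next(self._next)
        self.queues[idx].append(job)
        return idx
```

In this sketch, a web frontend would call `submit()` for each uploaded file, and each worker would pop jobs from its queue and launch the matching detector container via `build_docker_cmd()`.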
Datasets
The usage analysis draws on data collected from the platform itself (the abstract reports two-month trends in user activity). The datasets used to train the integrated detection models are listed in Table III, including images generated by StyleGAN2, ProGAN, and DALL-E, and the audio datasets ASVspoof 2019 LA, ASVspoof 2021 DF, and LibriSeVoc.
Model(s)
Multiple models are integrated, including Nodown, GLFF, HIFI, DMImageDetection, CLIP-ViT, NPR, DSP-FWA, WAV2Lip-STA, FTCN, SBI, AltFreezing, LSDA, LIPINC, RawNet2, LFCC-LCNN, RawNet3, RawNet2-Vocoder, and Whisper. These models utilize architectures such as ResNet50, XceptionNet, EfficientNet-b4, 3D ResNet50, and others.
Author countries
USA