Datasets, Clues and State-of-the-Arts for Multimedia Forensics: An Extensive Review

Authors: Ankit Yadav, Dinesh Kumar Vishwakarma

Published: 2024-01-13 07:03:58+00:00

AI Summary

This survey paper provides a comprehensive review of deep learning-based approaches for multimedia tampering detection. It analyzes benchmark datasets, tampering clues, common deep learning architectures, and state-of-the-art methods, and categorizes the methods by tampering type (deepfakes, splicing, copy-move, etc.). Finally, it identifies research gaps and future directions in the field.

Abstract

With large volumes of social media data being created daily and the parallel rise of realistic multimedia tampering methods, detecting and localising tampering in images and videos has become essential. This survey focusses on approaches for tampering detection in multimedia data using deep learning models. Specifically, it presents a detailed analysis of publicly available benchmark datasets for malicious manipulation detection. It also offers a comprehensive list of tampering clues and commonly used deep learning architectures. Next, it discusses the current state-of-the-art tampering detection methods, categorizing them into meaningful types such as deepfake detection methods, splice tampering detection methods, and copy-move tampering detection methods, and discussing their strengths and weaknesses. Top results achieved on benchmark datasets, a comparison of deep learning approaches against traditional methods, and critical insights from recent tampering detection methods are also discussed. Lastly, research gaps, future directions and a conclusion are presented to provide an in-depth understanding of the tampering detection research arena.


Key findings
Deep learning methods have significantly improved multimedia forgery detection accuracy and localization capabilities compared to traditional methods. However, research gaps remain, particularly the need for larger, more diverse datasets and for methods that are more robust to post-processing attacks and that generalize across tampering techniques. Attention mechanisms, transformers, and multi-modal approaches are promising future directions.
Approach
The paper reviews existing deep learning models for multimedia forgery detection, categorizing them by manipulation type (deepfakes, splicing, copy-move, etc.). It analyzes various datasets, features (clues), and architectures used in these models, comparing deep learning approaches with traditional methods and identifying strengths and weaknesses.
Datasets
Columbia, CASIA, MICC, FORENSIC, FaceForensics, FaceForensics++, VTD, IMD2020, DeeperForensics 1.0, DFDC, BOSSbase, RAISE, UCID, and others mentioned throughout the paper.
Model(s)
Convolutional Neural Networks (CNNs), Auto-Encoders, Generative Adversarial Networks (GANs), Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs), Capsule Networks, Deep Belief Networks (DBNs), Transformers, and combinations thereof.
Author countries
India