ForensicHub: A Unified Benchmark & Codebase for All-Domain Fake Image Detection and Localization

Authors: Bo Du, Xuekang Zhu, Xiaochen Ma, Chenfan Qu, Kaiwen Feng, Zhe Yang, Chi-Man Pun, Jian Liu, Jizhe Zhou

Published: 2025-05-16 08:49:59+00:00

AI Summary

ForensicHub is a unified benchmark and codebase for fake image detection and localization across four domains (deepfake, IMDL, AIGC, and document). It addresses the fragmentation of the field by providing a modular architecture and integrating existing benchmarks with new ones for AIGC and document image manipulation.

Abstract

The field of Fake Image Detection and Localization (FIDL) is highly fragmented, encompassing four domains: deepfake detection (Deepfake), image manipulation detection and localization (IMDL), artificial intelligence-generated image detection (AIGC), and document image manipulation localization (Doc). Although individual benchmarks exist in some domains, a unified benchmark covering all domains of FIDL is still missing. The absence of a unified benchmark results in significant domain silos, where each domain independently constructs its datasets, models, and evaluation protocols without interoperability, preventing cross-domain comparisons and hindering the development of the FIDL field as a whole. To break down these domain silos, we propose ForensicHub, the first unified benchmark & codebase for all-domain fake image detection and localization. Considering the drastic variations in dataset, model, and evaluation configurations across domains, as well as the scarcity of open-sourced baseline models and the lack of individual benchmarks in some domains, ForensicHub: i) proposes a modular and configuration-driven architecture that decomposes forensic pipelines into interchangeable components across datasets, transforms, models, and evaluators, allowing flexible composition across all domains; ii) fully implements 10 baseline models, 6 backbones, and 2 new benchmarks for AIGC and Doc, and integrates 2 existing benchmarks, DeepfakeBench and IMDLBenCo, through an adapter-based design; iii) conducts in-depth analysis based on ForensicHub, offering 8 key actionable insights into FIDL model architecture, dataset characteristics, and evaluation standards. ForensicHub represents a significant leap forward in breaking down the domain silos of the FIDL field and inspiring future breakthroughs.
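
To make the adapter-based design mentioned in the abstract concrete, the sketch below shows how an external benchmark model could be wrapped behind a shared forensic-model interface. It is a minimal illustration under assumed names: the classes (ForensicModel, DeepfakeBenchAdapter) and the "det_score" output key are hypothetical, not ForensicHub's actual API.

```python
from abc import ABC, abstractmethod
from typing import Dict

import torch
import torch.nn as nn


class ForensicModel(nn.Module, ABC):
    """Unified interface: every model maps an image batch to a dict of forensic outputs."""

    @abstractmethod
    def forward(self, image: torch.Tensor) -> Dict[str, torch.Tensor]:
        ...


class DeepfakeBenchAdapter(ForensicModel):
    """Wraps an external deepfake detector so its outputs fit the shared schema."""

    def __init__(self, external_model: nn.Module):
        super().__init__()
        self.external_model = external_model

    def forward(self, image: torch.Tensor) -> Dict[str, torch.Tensor]:
        logits = self.external_model(image)          # external model's native output
        return {"det_score": torch.sigmoid(logits)}  # translated to the unified output key
```

A wrapper of this kind lets detectors originally written for one benchmark be scored by the same evaluators as every other model in the hub.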


Key findings

Surprisingly, general-purpose visual backbones such as ConvNeXt and Swin Transformer often outperformed domain-specific models when trained under the unified IFF-Protocol. Experiments also showed that shallow feature extraction generally hurts performance on large, diverse datasets. Cross-domain evaluations highlighted the transferability of some models and exposed the strengths and weaknesses of existing approaches across domains.

Approach

ForensicHub uses a modular and configuration-driven architecture to decompose forensic pipelines into interchangeable components (datasets, transforms, models, evaluators). It integrates existing benchmarks (DeepfakeBench and IMDLBenCo) and introduces new benchmarks for AIGC and document images, enabling flexible composition and cross-domain comparisons.
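
To illustrate the configuration-driven composition, here is a minimal registry-style sketch: components are registered by name, and a plain config dict selects and instantiates a dataset, a model, and evaluators. The registry layout, component names, and config keys are illustrative assumptions, not ForensicHub's real configuration schema.

```python
from typing import Any, Callable, Dict, Type

# One registry per component family; a config then refers to components by name.
DATASETS: Dict[str, Type] = {}
MODELS: Dict[str, Type] = {}
EVALUATORS: Dict[str, Type] = {}


def register(registry: Dict[str, Type], name: str) -> Callable[[Type], Type]:
    """Decorator that records a component class in the given registry under `name`."""
    def decorator(cls: Type) -> Type:
        registry[name] = cls
        return cls
    return decorator


def build_pipeline(cfg: Dict[str, Any]):
    """Instantiate dataset, model, and evaluators purely from a config dict."""
    dataset = DATASETS[cfg["dataset"]["name"]](**cfg["dataset"].get("args", {}))
    model = MODELS[cfg["model"]["name"]](**cfg["model"].get("args", {}))
    evaluators = [EVALUATORS[e["name"]](**e.get("args", {})) for e in cfg["evaluators"]]
    return dataset, model, evaluators


# Hypothetical config: switching domains is a matter of changing names, not code.
# cfg = {
#     "dataset": {"name": "casia", "args": {"root": "/data/CASIA"}},
#     "model": {"name": "convnext", "args": {"pretrained": True}},
#     "evaluators": [{"name": "image_f1"}, {"name": "pixel_f1"}],
# }
```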

Datasets

FaceForensics++, Celeb-DF, DeepFakeDetection, DFDCP, DFDC, FaceShifter, UADFV, CASIA, COVERAGE, Columbia, IMD2020, NIST16, CocoGlide, AutoSplice, DiffusionForensics, GenImage, DocTamper, OSTF, RealTextManipulation, T-SROIE, Tampered-IC13

Model(s)

Capsule-Net, RECCE, SPSL, UCF, SBI, MVSS-Net, CAT-Net, PSCC-Net, TruFor, IML-ViT, Mesorch, DIRE, DualNet, HiFi-Net, Synthbuster, UnivFD, CAFTB, TIFDM, DTD, FFDN, ResNet, Xception, EfficientNet, SegFormer, Swin Transformer, ConvNeXt

Author countries

China, UAE, Portugal