TriDF: Evaluating Perception, Detection, and Hallucination for Interpretable DeepFake Detection

Authors: Jian-Yu Jiang-Lin, Kang-Yang Huang, Ling Zou, Ling Lo, Sheng-Ping Yang, Yu-Wen Tseng, Kun-Hsiang Lin, Chia-Ling Chen, Yu-Ting Ta, Yan-Tsung Wang, Po-Ching Chen, Hongxia Xie, Hong-Han Shuai, Wen-Huang Cheng

Published: 2025-12-11 14:01:01+00:00

AI Summary

This paper introduces TriDF, a comprehensive benchmark designed for interpretable DeepFake detection across image, video, and audio modalities, encompassing 16 DeepFake types from advanced synthesis models. TriDF evaluates models on three crucial aspects: Perception (identifying manipulation artifacts), Detection (classification performance), and Hallucination (explanation reliability). Experiments on state-of-the-art multimodal large language models demonstrate that while accurate perception is vital for reliable detection, hallucination can significantly undermine decision-making, emphasizing the interdependence of these three factors.

Abstract

Advances in generative modeling have made it increasingly easy to fabricate realistic portrayals of individuals, creating serious risks for security, communication, and public trust. Detecting such person-driven manipulations requires systems that not only distinguish altered content from authentic media but also provide clear and reliable reasoning. In this paper, we introduce TriDF, a comprehensive benchmark for interpretable DeepFake detection. TriDF contains high-quality forgeries from advanced synthesis models, covering 16 DeepFake types across image, video, and audio modalities. The benchmark evaluates three key aspects: Perception, which measures the ability of a model to identify fine-grained manipulation artifacts using human-annotated evidence; Detection, which assesses classification performance across diverse forgery families and generators; and Hallucination, which quantifies the reliability of model-generated explanations. Experiments on state-of-the-art multimodal large language models show that accurate perception is essential for reliable detection, but hallucination can severely disrupt decision-making, revealing the interdependence of these three aspects. TriDF provides a unified framework for understanding the interaction between detection accuracy, evidence identification, and explanation reliability, offering a foundation for building trustworthy systems that address real-world synthetic media threats.


Key findings

Accurate perception of manipulation artifacts is essential for robust DeepFake detection, but hallucinated explanations can severely disrupt decision-making and lead to misclassifications, revealing a critical interdependence between perception, detection, and explanation reliability. Current MLLMs struggle far more with semantic artifacts that require common-sense reasoning than with localized quality artifacts, and localization hints do not reliably improve their spatial focus. Moreover, strong hallucination can hold detection accuracy near chance even when perceptual coverage is high.

Approach

The authors introduce TriDF, a comprehensive benchmark for interpretable DeepFake detection, featuring 5K high-quality DeepFake samples spanning 16 manipulation types across image, video, and audio modalities. It evaluates models on Perception (identifying fine-grained manipulation artifacts against human annotations), Detection (classification performance), and Hallucination (explanation reliability) using True-False, Multiple-Choice, and Open-Ended questions.

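A benchmark of this shape is typically scored by tallying model answers per question format. The sketch below is illustrative only, not the authors' released code; the record fields (`format`, `pred`, `gold`) and format labels are assumed names.

```python
# Illustrative per-format accuracy tally for a TriDF-style evaluation.
# NOT the authors' code: record schema and format labels are assumptions.
from collections import defaultdict

def score_responses(records):
    """records: iterable of dicts with keys 'format' ('TF' | 'MCQ' | 'OE'),
    'pred' (model answer), and 'gold' (reference answer).
    Returns a dict mapping each question format to its accuracy."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for r in records:
        totals[r["format"]] += 1
        if r["pred"] == r["gold"]:
            hits[r["format"]] += 1
    return {fmt: hits[fmt] / totals[fmt] for fmt in totals}

records = [
    {"format": "TF", "pred": "fake", "gold": "fake"},
    {"format": "TF", "pred": "real", "gold": "fake"},
    {"format": "MCQ", "pred": "B", "gold": "B"},
]
print(score_responses(records))  # {'TF': 0.5, 'MCQ': 1.0}
```

Open-ended answers would in practice need a judge (human or model) to map free text onto a gold label before such a tally applies.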
Datasets

FaceForensics++, FFHQ, CelebAMaskHQ, CelebA-HQ, VGGFace2, Emu Edit, GEdit-Bench, ImgEdit, OmniContext, MS-COCO, Flickr30k, LAION-Aesthetics, VoxCeleb2, LRS2, TalkingHead-1KH, VPBench, FiVE-Bench, HDTF, CelebV-Text, Fashion Video, TED-talks, TikTok, A2 Bench, OpenS2V-Nexus, ConsisID, Panda-70M, HOIGen-1M, EMIME, VCTK, LibriTTS, LibriSpeech

Model(s)

GPT-5, Gemini 2.5-pro, Claude-Sonnet-4.5, InternVL2.5-8B, InternVL2.5-26B, InternVL2.5-38B, InternVL3.5-8B, InternVL3.5-38B-A3B, LLaVA-OV-7B, LLaVA-OV-72B, Qwen3-Omni-30B-A3B, Qwen3-VL-8B, Qwen3-VL-30B, MiniCPM-V-2.6, MiMo-VL-7B, Idefics2-8B, Mantis-8B, Phi-4, InternLM-XComposer2.5, mPLUG-Owl3-7B, Qwen2-Audio-7B, SALMONN-7B, audio-flamingo-3, FakeShield, FakeVLM, FatFormer, MM-Det, AIDE, DFD-FCG, Co-Spy, D3

Author countries

Taiwan, China