TruthLens: Explainable DeepFake Detection for Face Manipulated and Fully Synthetic Data
Authors: Rohit Kundu, Shan Jia, Vishal Mohanty, Athula Balachandran, Amit K. Roy-Chowdhury
Published: 2025-03-20 05:40:42+00:00
AI Summary
TruthLens is a novel DeepFake detection framework that not only classifies images as real or fake but also provides detailed textual explanations for its predictions. It achieves this by combining the global understanding of a multimodal large language model (PaliGemma2) with the localized feature extraction of a vision-only model (DINOv2), outperforming state-of-the-art methods in both accuracy and explainability.
Abstract
Detecting DeepFakes has become a crucial research area as the widespread use of AI image generators enables the effortless creation of face-manipulated and fully synthetic content, yet existing methods are often limited to binary classification (real vs. fake) and lack interpretability. To address these challenges, we propose TruthLens, a novel and highly generalizable framework for DeepFake detection that not only determines whether an image is real or fake but also provides detailed textual reasoning for its predictions. Unlike traditional methods, TruthLens effectively handles both face-manipulated DeepFakes and fully AI-generated content while addressing fine-grained queries such as "Do the eyes/nose/mouth look real or fake?" The architecture of TruthLens combines the global contextual understanding of multimodal large language models like PaliGemma2 with the localized feature extraction capabilities of vision-only models like DINOv2. This hybrid design leverages the complementary strengths of both models, enabling robust detection of subtle manipulations while maintaining interpretability. Extensive experiments on diverse datasets demonstrate that TruthLens outperforms state-of-the-art methods in detection accuracy (by 2-14%) and explainability, in both in-domain and cross-data settings, generalizing effectively across traditional and emerging manipulation techniques.
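To make the hybrid design in the abstract concrete, the sketch below shows one way the local-feature side could look: DINOv2 patch embeddings are extracted and projected into an assumed MLLM hidden size so they could be fused with PaliGemma2's visual tokens. This is a hedged illustration, not the authors' implementation; the model ids, the linear projection, and the 2304-dimensional target size are assumptions.

```python
# Minimal sketch of the hybrid idea: localized DINOv2 patch features projected
# toward an MLLM embedding space. NOT the authors' code; model ids, the
# projection layer, and the 2304-d target size are illustrative assumptions.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

# Vision-only backbone for fine-grained, per-patch facial detail.
dino_processor = AutoImageProcessor.from_pretrained("facebook/dinov2-base")
dino_model = AutoModel.from_pretrained("facebook/dinov2-base")
dino_model.eval()

# Hypothetical projection from DINOv2's 768-d patch embeddings into the
# MLLM's hidden size (2304 here is an assumed value, not from the paper).
local_to_mllm = torch.nn.Linear(768, 2304)

def localized_tokens(image: Image.Image) -> torch.Tensor:
    """Return DINOv2 patch embeddings projected into the assumed MLLM space."""
    inputs = dino_processor(images=image, return_tensors="pt")
    with torch.no_grad():
        out = dino_model(**inputs)
    patches = out.last_hidden_state[:, 1:, :]   # drop [CLS]; keep local patch tokens
    return local_to_mllm(patches)               # shape: (1, num_patches, 2304)

# In a full system, these projected local tokens would be combined with the
# MLLM's own visual tokens before its language decoder, which then answers
# prompts such as "Do the eyes/nose/mouth look real or fake?" with an explanation.
```

A working fusion would also require training the projection and adapting the MLLM so its decoder can attend to the injected local tokens; the sketch only indicates where the localized evidence would enter the pipeline.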