Exploring Specular Reflection Inconsistency for Generalizable Face Forgery Detection

Authors: Hongyan Fei, Zexi Jia, Chuanwei Huang, Jinchao Zhang, Jie Zhou

Published: 2026-02-06 07:27:19+00:00

AI Summary

This paper introduces a novel deepfake detection method, SRI-Net, which exploits inconsistencies in facial specular reflection, a component that AI generation methods (especially diffusion models) struggle to replicate accurately because of the complex physical laws governing it. It proposes a fast and accurate Retinex-based face texture estimation method to precisely separate the specular reflection component. SRI-Net then employs a two-stage cross-attention mechanism to capture correlations among specular reflection, face texture, and direct light, integrating these with image features for robust forgery detection.

Abstract

Detecting deepfakes has become increasingly challenging as forged faces synthesized by AI generation methods, particularly diffusion models, achieve unprecedented quality and resolution. Existing forgery detection approaches relying on spatial and frequency features demonstrate limited efficacy against high-quality, entirely synthesized forgeries. In this paper, we propose a novel detection method grounded in the observation that facial attributes governed by complex physical laws and multiple parameters are inherently difficult to replicate. Specifically, we focus on illumination, particularly the specular reflection component in the Phong illumination model, which poses the greatest replication challenge due to its parametric complexity and nonlinear formulation. We introduce a fast and accurate face texture estimation method based on Retinex theory to enable precise specular reflection separation. Furthermore, drawing from the mathematical formulation of specular reflection, we posit that forgery evidence manifests not only in the specular reflection itself but also in its relationship with the corresponding face texture and direct light. To capture these relationships, we design the Specular-Reflection-Inconsistency-Network (SRI-Net), incorporating a two-stage cross-attention mechanism to model these correlations and integrate specular-reflection-related features with image features for robust forgery detection. Experimental results demonstrate that our method achieves superior performance on both traditional deepfake datasets and generative deepfake datasets, particularly those containing diffusion-generated forgery faces.
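The "parametric complexity and nonlinear formulation" the abstract attributes to specular reflection can be seen directly in the Phong specular term, I_s = k_s * max(R·V, 0)^α, where R is the light direction mirrored about the surface normal, V the view direction, k_s the specular coefficient, and α the shininess exponent. The sketch below is a textbook rendering of that term, not code from the paper; the default k_s and α values are arbitrary.

```python
import numpy as np

def phong_specular(normal, light_dir, view_dir, k_s=0.5, shininess=32.0):
    """Phong specular term: I_s = k_s * max(R . V, 0) ** shininess,
    where R is the light direction reflected about the surface normal."""
    n = normal / np.linalg.norm(normal)
    l = light_dir / np.linalg.norm(light_dir)
    v = view_dir / np.linalg.norm(view_dir)
    r = 2.0 * np.dot(n, l) * n - l  # mirror reflection of the light direction
    return k_s * max(np.dot(r, v), 0.0) ** shininess
```

The exponent makes the highlight fall off sharply as the viewer moves away from the mirror direction, which is the nonlinearity that makes the term hard for generative models to reproduce consistently.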


Key findings
The proposed SRI-Net achieves superior performance on both traditional deepfake datasets (e.g., FF++, CelebDF, DFD) and challenging generative deepfake datasets (DiFF, DF40), especially those with diffusion-generated faces. Ablation studies confirm the effectiveness of specular reflection as a robust forgery indicator and the benefit of the Retinex-based texture extraction and cross-attention mechanism. The method also demonstrates strong robustness to common post-processing techniques and consistent performance across varying skin tones.
Approach
The approach analyzes facial specular reflection, identified as a strong forgery indicator due to its complex and nonlinear mathematical formulation within the Phong illumination model. It uses a Retinex-based method for accurate face texture and illumination separation, followed by a residual-based approach to extract specular reflection. A Specular-Reflection-Inconsistency-Network (SRI-Net) with a two-stage cross-attention mechanism then captures correlations among specular reflection, face texture, and direct light, fusing these with original image features for the final real/fake classification.
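The two-stage cross-attention fusion described above can be sketched as scaled dot-product attention applied twice: specular features first attend over texture features, and the result then attends over direct-light features. The identity projections, single head, and variable names below are illustrative simplifications under our own assumptions, not the SRI-Net architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context):
    """Scaled dot-product cross-attention (identity projections for brevity):
    each query row attends over all context rows and returns their
    attention-weighted average."""
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ context

# Hypothetical two-stage fusion: specular features attend to texture features,
# then the result attends to direct-light features.
rng = np.random.default_rng(0)
specular, texture, light = (rng.random((4, 8)) for _ in range(3))
fused = cross_attention(cross_attention(specular, texture), light)
```

Because each output row is a convex combination of context rows, the fused features stay within the range of the attended modality, one way such a mechanism can tie specular evidence back to texture and lighting.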
Datasets
FaceForensics++ (FF++), CelebDF v1, CelebDF v2, DeepfakeDetection (DFD), Diffusion Facial Forgery (DiFF), DF40, AI-Face dataset
Model(s)
Specular-Reflection-Inconsistency-Network (SRI-Net) with Xception as backbone, 3DDFA for 3D shape extraction
Author countries
China