Exploring Specular Reflection Inconsistency for Generalizable Face Forgery Detection
Hongyan Fei 1,1, Zexi Jia 2, Chuanwei Huang 1,1, Jinchao Zhang 2, Jie Zhou 2
Published on arXiv
2602.06452
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Achieves superior detection performance on both traditional deepfake datasets and diffusion-model-generated face forgery datasets compared to spatial- and frequency-domain baselines.
SRI-Net
Novel technique introduced
Detecting deepfakes has become increasingly challenging as forged faces synthesized by generative AI methods, particularly diffusion models, achieve unprecedented quality and resolution. Existing forgery detection approaches that rely on spatial and frequency features show limited efficacy against high-quality, entirely synthesized forgeries. In this paper, we propose a novel detection method grounded in the observation that facial attributes governed by complex physical laws and multiple parameters are inherently difficult to replicate. Specifically, we focus on illumination, particularly the specular reflection component in the Phong illumination model, which poses the greatest replication challenge due to its parametric complexity and nonlinear formulation. We introduce a fast and accurate face texture estimation method based on Retinex theory to enable precise specular reflection separation. Furthermore, drawing on the mathematical formulation of specular reflection, we posit that forgery evidence manifests not only in the specular reflection itself but also in its relationship with the corresponding face texture and direct light. To exploit this observation, we design the Specular-Reflection-Inconsistency-Network (SRI-Net), which incorporates a two-stage cross-attention mechanism to capture these correlations and integrates specular-reflection-related features with image features for robust forgery detection. Experimental results demonstrate that our method achieves superior performance on both traditional deepfake datasets and generative deepfake datasets, particularly those containing diffusion-generated forged faces.
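For reference, the two models named in the abstract can be written out in their standard textbook forms. The symbols below are conventional choices, not the paper's notation, and this summary does not specify how the paper parameterizes either model or maps them onto its texture estimation step.

```latex
% Phong illumination model (standard form): ambient + diffuse + specular
I = k_a\, i_a + \sum_{m \in \text{lights}}
    \Big[ k_d\, (\hat{L}_m \cdot \hat{N})\, i_{m,d}
        + k_s\, (\hat{R}_m \cdot \hat{V})^{\alpha}\, i_{m,s} \Big],
\qquad
\hat{R}_m = 2\,(\hat{L}_m \cdot \hat{N})\,\hat{N} - \hat{L}_m

% Retinex image formation (standard form): observed image = reflectance x illumination
P(x) = \rho(x)\, E(x)
\quad\Longleftrightarrow\quad
\log P(x) = \log \rho(x) + \log E(x)
```

Here $\hat{N}$ is the surface normal, $\hat{L}_m$ the direction to light $m$, $\hat{V}$ the view direction, $\hat{R}_m$ the mirror-reflection direction, and $\alpha$ the shininess exponent. The specular term is the only one that depends on the view direction and on a power of a dot product, which is the nonlinearity the abstract refers to. In the Retinex relation, $\rho$ is the reflectance (texture) and $E$ the illumination.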
Key Contributions
- Identifies the specular reflection component of the Phong illumination model as a generalizable forgery indicator, owing to its parametric complexity and nonlinearity.
- Introduces a Retinex-theory-based face texture estimation method for fast, accurate specular reflection extraction.
- Designs SRI-Net with two-stage cross-attention to exploit relationships among specular reflection, face texture, and direct light for robust deepfake detection.
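The two-stage cross-attention idea in the last contribution can be illustrated with a minimal NumPy sketch. This is not SRI-Net: the token shapes, the single attention head, the random weights, and in particular which input acts as query or context in each stage are all assumptions made for illustration, since this summary does not describe the architecture in that detail. The names `cross_attention` and `two_stage_fusion` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query, context, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention.

    query:   (n_q, d) tokens that ask the question
    context: (n_c, d) tokens that are attended to
    Returns the attended values (n_q, d) and the attention weights (n_q, n_c).
    """
    q = query @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

def two_stage_fusion(specular, texture, light, image, params):
    # Stage 1 (assumed ordering): specular tokens attend to texture and
    # direct-light tokens, relating the specular map to both.
    context = np.concatenate([texture, light], axis=0)
    spec_ctx, _ = cross_attention(specular, context, *params["stage1"])
    # Stage 2 (assumed ordering): image tokens attend to the
    # specular-aware tokens produced by stage 1.
    fused, attn = cross_attention(image, spec_ctx, *params["stage2"])
    return fused, attn

rng = np.random.default_rng(0)
d, n = 16, 8  # hypothetical feature width and token count

def init_weights():
    # Random projection matrices standing in for learned parameters
    return tuple(rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

params = {"stage1": init_weights(), "stage2": init_weights()}
specular, texture, light, image = (rng.standard_normal((n, d)) for _ in range(4))

fused, attn = two_stage_fusion(specular, texture, light, image, params)
print(fused.shape, attn.shape)
```

In a trained detector the fused tokens would feed a classification head; here they only show how features from the three illumination-related inputs could be combined with image features.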
🛡️ Threat Analysis
Proposes a detection method for AI-generated (deepfake) face images. This is AI-generated content detection, which falls squarely under output integrity and content authenticity.