Phase4DFD: Multi-Domain Phase-Aware Attention for Deepfake Detection
Zhen-Xin Lin, Shang-Kuan Chen
Published on arXiv
2601.05861
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Phase4DFD outperforms state-of-the-art spatial and frequency-based deepfake detectors on CIFAKE and DFFD, with ablations confirming that explicit phase modeling provides complementary, non-redundant information beyond magnitude-only representations.
Phase4DFD
Novel technique introduced
Recent deepfake detection methods have increasingly explored frequency-domain representations to reveal manipulation artifacts that are difficult to detect in the spatial domain. However, most existing approaches rely primarily on spectral magnitude, leaving the role of phase information underexplored. In this work, we propose Phase4DFD, a phase-aware frequency-domain deepfake detection framework that explicitly models phase-magnitude interactions via a learnable attention mechanism. Our approach augments standard RGB input with Fast Fourier Transform (FFT) magnitude and local binary pattern (LBP) representations to expose subtle synthesis artifacts that remain indistinguishable under spatial analysis alone. Crucially, we introduce an input-level phase-aware attention module that uses phase discontinuities commonly introduced by synthetic generation to guide the model toward the frequency patterns most indicative of manipulation before backbone feature extraction. The attended multi-domain representation is processed by an efficient BNext-M backbone, with optional channel-spatial attention applied for semantic feature refinement. Extensive experiments on the CIFAKE and DFFD datasets demonstrate that Phase4DFD outperforms state-of-the-art spatial and frequency-based detectors while maintaining low computational overhead. Comprehensive ablation studies further confirm that explicit phase modeling provides complementary and non-redundant information beyond magnitude-only frequency representations.
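To make the multi-domain input concrete, the following is a minimal NumPy sketch of how RGB, FFT magnitude, and LBP channels might be stacked, with the phase spectrum returned separately. It is not the authors' implementation. The function names, the grayscale conversion, the log scaling, the basic 8-neighbour LBP variant, and the resulting 5-channel layout are all assumptions made for illustration.

```python
import numpy as np

def to_gray(rgb):
    """Luminance approximation using ITU-R BT.601 weights (assumed choice)."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def fft_magnitude_phase(gray):
    """Return the centered log-magnitude and phase (radians) of the 2D FFT."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    magnitude = np.log1p(np.abs(spectrum))
    phase = np.angle(spectrum)  # values lie in [-pi, pi]
    return magnitude, phase

def lbp_8(gray):
    """Basic 8-neighbour LBP with edge padding; codes lie in [0, 255]."""
    padded = np.pad(gray, 1, mode="edge")
    h, w = gray.shape
    codes = np.zeros((h, w), dtype=np.uint8)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        # Set this bit wherever the neighbour is at least as bright as the center.
        codes |= (neighbour >= gray).astype(np.uint8) * np.uint8(1 << bit)
    return codes

def multi_domain_input(rgb):
    """Stack RGB, FFT log-magnitude, and LBP into an (H, W, 5) array.

    `rgb` is assumed to be a float array in [0, 1] with shape (H, W, 3).
    The phase map is returned separately so an attention step can use it.
    """
    gray = to_gray(rgb)
    magnitude, phase = fft_magnitude_phase(gray)
    magnitude = magnitude / (magnitude.max() + 1e-8)  # scale to [0, 1]
    lbp = lbp_8(gray).astype(np.float64) / 255.0
    return np.dstack([rgb, magnitude, lbp]), phase
```

Calling `multi_domain_input` on a 32x32 RGB image yields a 32x32x5 array plus a 32x32 phase map. Whether the paper computes the FFT per color channel or on a grayscale image is not stated in this summary, so the grayscale choice here is only one plausible reading.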
Key Contributions
- Input-level phase-aware attention module that leverages phase discontinuities introduced by synthetic generation to guide the network toward manipulation-indicative frequency patterns before backbone feature extraction
- Multi-domain input representation combining RGB, FFT magnitude, and Local Binary Pattern (LBP) features to expose synthesis artifacts invisible in the spatial domain alone
- Efficient integration with BNext-M backbone achieving state-of-the-art deepfake detection on CIFAKE and DFFD while maintaining low computational overhead
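The first contribution above can be illustrated with a toy gating step. The sketch below assumes that "phase discontinuity" is measured as the magnitude of the wrapped finite-difference gradient of the phase map, and it replaces the learnable attention module with two scalar parameters (`weight`, `bias`). Both choices are hypothetical stand-ins, since the summary does not describe the module's actual layers.

```python
import numpy as np

def wrapped_diff(phase, axis):
    """Forward difference of phase along `axis`, wrapped into [-pi, pi)."""
    last = np.take(phase, [-1], axis=axis)  # repeat last slice so the shape is kept
    d = np.diff(phase, axis=axis, append=last)
    return (d + np.pi) % (2 * np.pi) - np.pi

def phase_discontinuity(phase):
    """Per-pixel magnitude of the wrapped phase gradient."""
    dy = wrapped_diff(phase, axis=0)
    dx = wrapped_diff(phase, axis=1)
    return np.hypot(dx, dy)

def phase_attention(magnitude, phase, weight=1.0, bias=0.0):
    """Gate the magnitude spectrum with a sigmoid of the discontinuity map.

    `weight` and `bias` are hypothetical placeholders for learnable
    parameters; a trained module would likely use convolutional layers.
    """
    score = weight * phase_discontinuity(phase) + bias
    attention = 1.0 / (1.0 + np.exp(-score))
    return magnitude * attention, attention
```

With the default parameters, a perfectly smooth phase map gives a uniform attention of 0.5, while locations with abrupt phase jumps receive larger weights. Wrapping the differences keeps a transition across the ±π boundary from being mistaken for a large discontinuity.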
🛡️ Threat Analysis
Proposes a deepfake/synthetic image detection framework, which falls squarely within ML09's scope of AI-generated content detection. The primary contribution is a novel detection architecture (phase-aware attention plus multi-domain fusion) evaluated against state-of-the-art detectors, not merely an application of existing methods.