SpecXNet: A Dual-Domain Convolutional Network for Robust Deepfake Detection
Inzamamul Alam, Md Tanvir Islam, Simon S. Woo
Published on arXiv
2509.22070
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Achieves state-of-the-art deepfake detection accuracy on cross-dataset and unseen manipulation benchmarks while maintaining real-time feasibility.
SpecXNet
Novel technique introduced
The increasing realism of content generated by GANs and diffusion models has made deepfake detection significantly more challenging. Existing approaches often focus solely on spatial or frequency-domain features, limiting their generalization to unseen manipulations. We propose the Spectral Cross-Attentional Network (SpecXNet), a dual-domain architecture for robust deepfake detection. The core Dual-Domain Feature Coupler (DDFC) decomposes features into a local spatial branch that captures texture-level anomalies and a global spectral branch that employs the Fast Fourier Transform (FFT) to model periodic inconsistencies. This dual-domain formulation allows SpecXNet to jointly exploit localized detail and global structural coherence, both of which are critical for distinguishing authentic from manipulated images. We also introduce the Dual Fourier Attention (DFA) module, which dynamically fuses spatial and spectral features in a content-aware manner. SpecXNet is built atop a modified XceptionNet backbone, with the DDFC and DFA modules embedded within a separable convolution block. Extensive experiments on multiple deepfake benchmarks show that SpecXNet achieves state-of-the-art accuracy, particularly under cross-dataset and unseen manipulation scenarios, while maintaining real-time feasibility. Our results highlight the effectiveness of unified spatial-spectral learning for robust and generalizable deepfake detection. To ensure reproducibility, we released the full code on GitHub: https://github.com/inzamamulDU/SpecXNet
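The dual-branch idea in the abstract can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the released code: the channel split between branches, the 3x3 local filter, the learnable-style `freq_weight` mask, and all function names are hypothetical. The sketch only shows the general pattern of pairing a local spatial filter with an FFT-based global filter.

```python
import numpy as np

def local_spatial_branch(x, kernel):
    """Apply one 3x3 filter to every channel with zero padding (local receptive field)."""
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[:, dy:dy + h, dx:dx + w]
    return out

def global_spectral_branch(x, freq_weight):
    """FFT, pointwise reweighting of frequency components, then inverse FFT.

    Each frequency coefficient depends on every pixel, so this branch has a
    global receptive field and responds to periodic patterns.
    """
    spec = np.fft.rfft2(x, axes=(-2, -1))      # shape (c, h, w // 2 + 1)
    spec = spec * freq_weight                  # hypothetical weights, broadcast over channels
    return np.fft.irfft2(spec, s=x.shape[-2:], axes=(-2, -1))

def dual_domain_coupler(x, kernel, freq_weight):
    """Assumed design: first half of the channels goes local, second half goes spectral."""
    x = np.asarray(x, dtype=np.float64)
    split = x.shape[0] // 2
    local = local_spatial_branch(x[:split], kernel)
    spectral = global_spectral_branch(x[split:], freq_weight)
    return np.concatenate([local, spectral], axis=0)
```

With an identity kernel and an all-ones `freq_weight`, both branches pass their input through unchanged, which is a convenient sanity check before substituting learned weights.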
Key Contributions
- Dual-Domain Feature Coupler (DDFC) that jointly processes local spatial texture anomalies and global spectral inconsistencies via FFT
- Dual Fourier Attention (DFA) module for content-aware dynamic fusion of spatial and spectral features
- Modified XceptionNet backbone that integrates DDFC and DFA, achieving state-of-the-art cross-dataset deepfake detection
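As a rough illustration of what "content-aware dynamic fusion" can mean, the sketch below blends a spatial and a spectral feature map with a per-channel gate computed from the features themselves. This is a generic gated fusion, not the paper's DFA module, whose internals are not described in this summary; the projection `proj`, bias `bias`, and the function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def content_aware_fusion(spatial, spectral, proj, bias):
    """Blend two (c, h, w) feature maps using a gate derived from their content.

    proj has shape (c, 2c) and bias has shape (c,); both are hypothetical
    learnable parameters.
    """
    # Global average pooling: one descriptor per channel and per branch.
    desc = np.concatenate([spatial.mean(axis=(1, 2)), spectral.mean(axis=(1, 2))])
    gate = sigmoid(proj @ desc + bias)[:, None, None]   # (c, 1, 1), values in [0, 1]
    # Gate near 1 favours the spatial branch, near 0 the spectral branch.
    return gate * spatial + (1.0 - gate) * spectral
```

Because the gate is recomputed for each input, the mix between spatial and spectral evidence changes from image to image rather than being fixed at training time.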
🛡️ Threat Analysis
Proposes a deepfake detection system that distinguishes authentic images from those generated or manipulated by GANs and diffusion models, directly addressing output integrity and AI-generated content detection.