A Hybrid Deep Learning and Forensic Approach for Robust Deepfake Detection
Published on arXiv
2510.27392
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
The hybrid model achieves F1-scores of 0.96, 0.82, and 0.77 on FaceForensics++, Celeb-DF v2, and DFDC, respectively, outperforming single-method and existing hybrid baselines while maintaining robustness to unseen manipulations.
The rapid evolution of generative adversarial networks (GANs) and diffusion models has made synthetic media increasingly realistic, raising societal concerns around misinformation, identity fraud, and digital trust. Existing deepfake detection methods either rely on deep learning, which suffers from poor generalization and vulnerability to distortions, or on forensic analysis, which is interpretable but limited against new manipulation techniques. This study proposes a hybrid framework that fuses forensic features, including noise residuals, JPEG compression traces, and frequency-domain descriptors, with deep learning representations from convolutional neural networks (CNNs) and vision transformers (ViTs). Evaluated on benchmark datasets (FaceForensics++, Celeb-DF v2, DFDC), the proposed model consistently outperformed single-method baselines and existing state-of-the-art hybrid approaches, achieving F1-scores of 0.96, 0.82, and 0.77, respectively. Robustness tests demonstrated stable performance under compression (F1 = 0.87 at QF = 50), adversarial perturbations (AUC = 0.84), and unseen manipulations (F1 = 0.79). Importantly, explainability analysis showed that Grad-CAM and forensic heatmaps overlapped with ground-truth manipulated regions in 82% of cases, enhancing transparency and user trust. These findings confirm that hybrid approaches offer a balanced solution, combining the adaptability of deep models with the interpretability of forensic cues to build resilient and trustworthy deepfake detection systems.
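To make the fusion idea concrete, here is a minimal sketch of the pipeline the abstract describes: extract hand-crafted forensic statistics (a high-pass noise residual and a radially averaged frequency-domain descriptor) and concatenate them with a deep embedding. The specific filters, descriptor sizes, and the `fuse` helper are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def noise_residual(img: np.ndarray, k: int = 3) -> np.ndarray:
    """High-pass noise residual: image minus a local box-filter mean.

    Assumption: a simple stand-in for the denoising-based residuals the
    paper mentions; the authors' exact filter is not specified here.
    """
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    smoothed = np.zeros_like(img, dtype=np.float64)
    for dy in range(k):          # accumulate the k*k neighborhood
        for dx in range(k):
            smoothed += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    smoothed /= k * k
    return img.astype(np.float64) - smoothed

def frequency_descriptor(img: np.ndarray, n_bands: int = 8) -> np.ndarray:
    """Radially averaged FFT magnitude spectrum as a fixed-length vector."""
    spec = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    h, w = spec.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - h // 2, xx - w // 2)
    bands = np.minimum((r / r.max() * n_bands).astype(int), n_bands - 1)
    return np.array([spec[bands == b].mean() for b in range(n_bands)])

def fuse(deep_embedding: np.ndarray, img: np.ndarray) -> np.ndarray:
    """Late fusion by concatenation: deep features + forensic statistics."""
    forensic = np.concatenate([
        [noise_residual(img).std()],   # residual energy
        frequency_descriptor(img),     # spectral shape
    ])
    return np.concatenate([deep_embedding, forensic])

# Toy example with a random image and a placeholder CNN/ViT embedding.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
emb = rng.standard_normal(128)
fused = fuse(emb, img)
print(fused.shape)  # (137,) = 128 deep dims + 1 residual stat + 8 bands
```

In practice the fused vector would feed a small classification head; the paper's actual fusion strategy (early, late, or attention-based) may differ.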
Key Contributions
- Hybrid fusion framework combining forensic features (noise residuals, JPEG compression traces, frequency-domain descriptors) with CNN and Vision Transformer representations for deepfake detection
- Demonstrated robustness under real-world degradations: F1=0.87 under JPEG compression (QF=50) and AUC=0.84 under adversarial perturbations
- Explainability analysis showing 82% overlap between Grad-CAM / forensic heatmaps and ground-truth manipulated regions, improving interpretability
🛡️ Threat Analysis
The paper's primary contribution is a novel AI-generated content detection architecture. Deepfake detection — verifying whether media is synthetically generated — falls squarely under ML09 (Output Integrity Attack) and content authenticity. The paper proposes a new hybrid forensic + deep learning detection method rather than merely applying existing detectors to a new domain.