DeepShield: Fortifying Deepfake Video Detection with Local and Global Forgery Analysis
Yinqi Cai 1, Jichang Li, Zhaolun Li 1, Weikai Chen 2,3,4, Rushi Lan 2, Xi Xie 2, Xiaonan Luo 2,3, Guanbin Li 3
1 Guilin University of Electronic Technology
4 Guangdong Key Laboratory of Big Data Analysis and Processing
Published on arXiv
2510.25237
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Outperforms state-of-the-art methods in cross-dataset and cross-manipulation evaluations, demonstrating superior generalization to unseen deepfake techniques.
DeepShield
Novel technique introduced
Recent advances in deep generative models have made it easier to manipulate face videos, raising significant concerns about their potential misuse for fraud and misinformation. Existing detectors often perform well in in-domain scenarios but fail to generalize across diverse manipulation techniques due to their reliance on forgery-specific artifacts. In this work, we introduce DeepShield, a novel deepfake detection framework that balances local sensitivity and global generalization to improve robustness across unseen forgeries. DeepShield enhances the CLIP-ViT encoder through two key components: Local Patch Guidance (LPG) and Global Forgery Diversification (GFD). LPG applies spatiotemporal artifact modeling and patch-wise supervision to capture fine-grained inconsistencies often overlooked by global models. GFD introduces domain feature augmentation, leveraging domain-bridging and boundary-expanding feature generation to synthesize diverse forgeries, mitigating overfitting and enhancing cross-domain adaptability. Through the integration of novel local and global analysis for deepfake detection, DeepShield outperforms state-of-the-art methods in cross-dataset and cross-manipulation evaluations, achieving superior robustness against unseen deepfake attacks. Code is available at https://github.com/lijichang/DeepShield.
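To make the idea of patch-wise supervision concrete, here is a minimal sketch in plain Python. It is not taken from the DeepShield code base. It assumes that a pixel-level manipulation mask is available for each training frame, that the mask is average-pooled into one soft label per patch, and that a binary cross-entropy loss is applied to per-patch logits. The function names `patch_labels_from_mask` and `patch_bce_loss` are hypothetical, and the paper's actual LPG loss and its spatiotemporal artifact modeling may differ.

```python
import math

def patch_labels_from_mask(mask, patch_size):
    """Average-pool a pixel-level forgery mask (list of rows, values in [0, 1])
    into a grid of per-patch soft labels. Assumed labeling scheme, for illustration."""
    height, width = len(mask), len(mask[0])
    if height % patch_size or width % patch_size:
        raise ValueError("mask size must be divisible by patch_size")
    labels = []
    for top in range(0, height, patch_size):
        row = []
        for left in range(0, width, patch_size):
            total = sum(
                mask[y][x]
                for y in range(top, top + patch_size)
                for x in range(left, left + patch_size)
            )
            row.append(total / (patch_size * patch_size))
        labels.append(row)
    return labels

def patch_bce_loss(patch_logits, patch_labels):
    """Mean binary cross-entropy between per-patch logits and soft labels."""
    losses = []
    for logit_row, label_row in zip(patch_logits, patch_labels):
        for z, y in zip(logit_row, label_row):
            # Numerically stable form: max(z, 0) - z*y + log(1 + exp(-|z|))
            losses.append(max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z))))
    return sum(losses) / len(losses)

# Toy 4x4 mask whose top-left 2x2 patch is fully manipulated.
mask = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
labels = patch_labels_from_mask(mask, patch_size=2)

# Placeholder logits of zero stand in for the output of a patch-level head.
loss = patch_bce_loss([[0.0, 0.0], [0.0, 0.0]], labels)
```

In a real detector the logits would come from a head on top of the encoder's patch tokens. The point of the sketch is only that each patch receives its own training signal, so a small manipulated region is not averaged away by a single video-level label.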
Key Contributions
- DeepShield framework enhancing CLIP-ViT for deepfake detection with dual local/global analysis components
- Local Patch Guidance (LPG): spatiotemporal artifact modeling with patch-wise supervision to capture fine-grained forgery inconsistencies
- Global Forgery Diversification (GFD): domain-bridging and boundary-expanding feature augmentation to improve cross-domain generalization
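The two GFD ideas can be illustrated with a small feature-space sketch. It rests on assumptions that the summary above does not confirm: "domain-bridging" is read here as convex interpolation between features from two forgery domains, and "boundary-expanding" as extrapolation of a feature away from its own domain centroid. The helper names `bridge`, `centroid` and `expand`, and the coefficients `lam` and `alpha`, are hypothetical and are not the paper's notation.

```python
def bridge(feat_a, feat_b, lam):
    """Assumed domain-bridging step: convex mix of two forgery-domain features."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * a + (1.0 - lam) * b for a, b in zip(feat_a, feat_b)]

def centroid(features):
    """Mean feature vector of one forgery domain."""
    count = len(features)
    return [sum(column) / count for column in zip(*features)]

def expand(feat, center, alpha):
    """Assumed boundary-expanding step: push a feature away from its centroid."""
    return [f + alpha * (f - c) for f, c in zip(feat, center)]

# Toy 2-D features for two forgery domains (illustrative values only).
domain_a = [[1.0, 0.0], [3.0, 0.0]]
domain_b = [[0.0, 2.0]]

# A synthetic feature lying between the two domains.
bridged = bridge(domain_a[0], domain_b[0], lam=0.5)

# A synthetic feature lying just outside domain A's observed range.
expanded = expand(domain_a[1], centroid(domain_a), alpha=0.5)
```

Under these assumptions, bridged features fill the gap between known manipulation types and expanded features sit slightly beyond what any single known type covers. Training on both would expose the classifier to forgery features it has not literally observed, which is the intuition behind the cross-domain generalization claim.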
🛡️ Threat Analysis
DeepShield is a detection system for AI-generated content that specifically targets deepfake videos. Deepfake detection is explicitly listed under ML09 (output integrity / content provenance). The paper proposes a novel detection architecture rather than merely applying existing methods to a new domain.