When Deepfake Detection Meets Graph Neural Network: a Unified and Lightweight Learning Framework
Haoyu Liu 1, Chaoyu Gong 1, Mengke He 1, Jiate Li 2,1, Kai Han 3, Siqiang Luo 1
Published on arXiv (2508.05526)
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
SSTGNN achieves superior in-domain and cross-domain deepfake detection with up to 42× fewer parameters than state-of-the-art models
SSTGNN
Novel technique introduced
The proliferation of generative video models has made detecting AI-generated and manipulated videos an urgent challenge. Existing detection approaches often fail to generalize across diverse manipulation types because they rely on isolated spatial, temporal, or spectral information, and they typically require large models to perform well. This paper introduces SSTGNN, a lightweight Spatial-Spectral-Temporal Graph Neural Network framework that represents videos as structured graphs, enabling joint reasoning over spatial inconsistencies, temporal artifacts, and spectral distortions. SSTGNN incorporates learnable spectral filters and spatial-temporal differential modeling into a unified graph-based architecture, capturing subtle manipulation traces more effectively. Extensive experiments on diverse benchmark datasets demonstrate that SSTGNN not only achieves superior performance in both in-domain and cross-domain settings, but also offers strong computational efficiency. Remarkably, SSTGNN accomplishes these results with up to 42× fewer parameters than state-of-the-art models, making it highly lightweight and resource-friendly for real-world deployment.
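To make the "videos as structured graphs" idea concrete, the sketch below builds one plausible spatial-temporal graph over video patches with temporal differential features. This is an illustrative assumption, not the paper's code: the node definition (frame patches), the edge scheme (fully connected within a frame, same-patch links across adjacent frames), and the frame-difference features are all hypothetical choices consistent with the abstract's description.

```python
import numpy as np

# Hypothetical dimensions: T frames, P patches per frame, D-dim patch embeddings.
T, P, D = 4, 9, 16
rng = np.random.default_rng(0)
feats = rng.standard_normal((T, P, D))  # stand-in patch embeddings

# Spatial-temporal differential features (assumption): differences between
# consecutive frames, intended to surface temporal artifacts such as flicker.
diff = np.zeros_like(feats)
diff[1:] = feats[1:] - feats[:-1]
node_feats = np.concatenate([feats, diff], axis=-1).reshape(T * P, 2 * D)

# Adjacency: spatial edges among patches of the same frame, plus temporal
# edges linking the same patch index across adjacent frames.
N = T * P
A = np.zeros((N, N))
for t in range(T):
    idx = np.arange(t * P, (t + 1) * P)
    A[np.ix_(idx, idx)] = 1.0                # spatial edges (within frame)
for t in range(T - 1):
    for p in range(P):
        i, j = t * P + p, (t + 1) * P + p
        A[i, j] = A[j, i] = 1.0              # temporal edges (across frames)
np.fill_diagonal(A, 0.0)                     # no self-loops

print(node_feats.shape)        # (36, 32): T*P nodes, feature + differential
print(int(A.sum()) // 2)       # 171 undirected edges: 4*36 spatial + 27 temporal
```

A GNN operating on `(node_feats, A)` can then propagate evidence jointly across space and time, which is the kind of joint reasoning the abstract attributes to SSTGNN.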
Key Contributions
- SSTGNN: a unified graph-based architecture that jointly models spatial inconsistencies, temporal artifacts, and spectral distortions for deepfake video detection
- Learnable spectral filters and spatial-temporal differential modeling embedded within a GNN framework for capturing subtle manipulation traces
- Achieves up to 42× parameter reduction compared to SOTA models while maintaining superior in-domain and cross-domain detection performance
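The "learnable spectral filters" contribution can be sketched as a learnable polynomial of the normalized graph Laplacian, a common GNN spectral-filtering design. This is a minimal sketch under that assumption; the paper's exact parameterization (filter order, basis, normalization) may differ.

```python
import numpy as np

rng = np.random.default_rng(1)
N, D, K = 6, 8, 3                         # nodes, feature dim, filter order

# Random symmetric adjacency as a stand-in graph.
A = (rng.random((N, N)) < 0.4).astype(float)
A = np.triu(A, 1)
A = A + A.T

# Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}.
deg = A.sum(axis=1)
d_inv_sqrt = np.zeros_like(deg)
nz = deg > 0
d_inv_sqrt[nz] = deg[nz] ** -0.5
L = np.eye(N) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

X = rng.standard_normal((N, D))           # node features
theta = rng.standard_normal(K)            # learnable filter coefficients

# Spectral filtering as a polynomial in L: H = sum_k theta_k * L^k X.
# In training, theta would be optimized by backprop; here it is random.
H = sum(theta[k] * np.linalg.matrix_power(L, k) @ X for k in range(K))
print(H.shape)                            # (6, 8)
```

Polynomial filters like this act on the graph's frequency spectrum without an explicit eigendecomposition, which keeps the parameter count small per layer, consistent with the lightweight design the card emphasizes.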
🛡️ Threat Analysis
Proposes a novel detection architecture for AI-generated and manipulated video content. Deepfake detection is explicitly listed under ML09 (output integrity / AI-generated content detection). The contribution is a new forensic method, not merely an application of existing detectors to a new domain.