A Spatial-Frequency Aware Multi-Scale Fusion Network for Real-Time Deepfake Detection

With the rapid advancement of real-time deepfake generation techniques, forged content is becoming increasingly realistic and widespread across applications like video conferencing and social media. Although state-of-the-art detectors achieve high accuracy on standard benchmarks, their heavy computational cost hinders real-time deployment in practical applications. To address this, we propose the Spatial-Frequency Aware Multi-Scale Fusion Network (SFMFNet), a lightweight yet effective architecture for real-time deepfake detection. We design a spatial-frequency hybrid aware module that jointly leverages spatial textures and frequency artifacts through a gated mechanism, enhancing sensitivity to subtle manipulations. A token-selective cross attention mechanism enables efficient multi-level feature interaction, while a residual-enhanced blur pooling structure helps retain key semantic cues during downsampling. Experiments on several benchmark datasets show that SFMFNet achieves a favorable balance between accuracy and efficiency, with strong generalization and practical value for real-time applications.
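The gated spatial-frequency fusion described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a one-level Haar wavelet as the frequency branch, a plain low-pass band as the spatial branch, and a per-position sigmoid gate with scalar weights. The names `haar_bands`, `gated_fusion`, `w_s`, `w_f`, and `bias` are hypothetical. In a trained network the gate would come from learned layers, and the coordinate attention the module uses is not modeled here.

```python
import math


def haar_bands(x):
    """One-level 2D Haar split of an even-sized grid (averaging normalization).

    Returns (ll, hf): the low-pass band and the summed magnitude of the three
    detail bands, each at half the input resolution.
    """
    ll, hf = [], []
    for i in range(0, len(x), 2):
        ll_row, hf_row = [], []
        for j in range(0, len(x[0]), 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            ll_row.append((a + b + c + d) / 4.0)   # local average
            lh = (a - b + c - d) / 4.0             # left-right difference
            hl = (a + b - c - d) / 4.0             # top-bottom difference
            hh = (a - b - c + d) / 4.0             # diagonal difference
            hf_row.append(abs(lh) + abs(hl) + abs(hh))
        ll.append(ll_row)
        hf.append(hf_row)
    return ll, hf


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gated_fusion(spatial, freq, w_s=1.0, w_f=1.0, bias=0.0):
    """Blend two same-sized maps with a per-position gate g in (0, 1).

    Assumption: the gate is a sigmoid of a weighted sum of both inputs;
    the scalar weights stand in for learned parameters.
    """
    fused = []
    for s_row, f_row in zip(spatial, freq):
        row = []
        for s, f in zip(s_row, f_row):
            g = sigmoid(w_s * s + w_f * f + bias)
            row.append(g * s + (1.0 - g) * f)
        fused.append(row)
    return fused


# Usage: split a tiny 2x2 patch and fuse its two bands.
ll, hf = haar_bands([[1.0, 0.0], [0.0, 1.0]])
fused = gated_fusion(ll, hf)
```

Because the gate is computed per position, each output value is a convex combination of the two branches, so the fused map never leaves the range spanned by its inputs at that position.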

Key Contributions

Spatial-frequency hybrid aware module fusing wavelet features and coordinate attention via a dynamic gating map to enhance forgery region perception
Token-selective cross attention module for efficient cross-scale feature interaction and forgery feature alignment
Residual downsampling module based on blur pooling to preserve structural and edge details while reducing aliasing
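The third contribution builds on blur pooling, which low-pass filters a feature map before subsampling so that high-frequency content does not alias. The sketch below shows that idea on a single-channel grid in plain Python. It assumes a 3x3 binomial kernel and replicate padding. The residual variant, which adds a scaled strided shortcut, is only one plausible reading of "residual-enhanced"; the summary does not specify the actual structure. The names `blur_pool_2d`, `residual_blur_downsample`, and `alpha` are hypothetical.

```python
def blur_pool_2d(x, stride=2):
    """Anti-aliased downsampling: 3x3 binomial blur, then subsample."""
    k1 = [1.0, 2.0, 1.0]
    kernel = [[a * b / 16.0 for b in k1] for a in k1]  # weights sum to 1
    h, w = len(x), len(x[0])

    def px(i, j):
        # Replicate padding: clamp out-of-range indices to the border.
        return x[min(max(i, 0), h - 1)][min(max(j, 0), w - 1)]

    out = []
    for i in range(0, h, stride):
        row = []
        for j in range(0, w, stride):
            row.append(sum(kernel[di + 1][dj + 1] * px(i + di, j + dj)
                           for di in (-1, 0, 1)
                           for dj in (-1, 0, 1)))
        out.append(row)
    return out


def residual_blur_downsample(x, stride=2, alpha=0.5):
    """Hypothetical residual variant: blurred path plus a scaled shortcut.

    The shortcut is a plain strided subsample of the input, so sharp values
    at the sampled positions are partly carried past the blur.
    """
    blurred = blur_pool_2d(x, stride)
    shortcut = [row[::stride] for row in x[::stride]]
    return [[b + alpha * s for b, s in zip(b_row, s_row)]
            for b_row, s_row in zip(blurred, shortcut)]


# Usage: a checkerboard is the worst case for naive strided subsampling.
board = [[float((i + j) % 2) for j in range(4)] for i in range(4)]
naive = [row[::2] for row in board[::2]]   # every sample lands on a 0
smooth = blur_pool_2d(board)               # interior sample is a local mean
```

On the checkerboard, naive stride-2 subsampling returns only zeros and loses the pattern entirely, while the blurred path reports a local mean at interior positions.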

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel deepfake detection architecture that verifies content integrity by identifying AI-generated or manipulated video content, which is a core output integrity and content authenticity problem.

Details

Domains

vision

Model Types

cnn, transformer

Threat Tags

inference_time, digital

Datasets

FaceForensics++, Celeb-DF v2

Applications

2026 0 cit.

Similar Papers

Morphology-optimized Multi-Scale Fusion: Combining Local Artifacts and Mesoscopic Semantics for Deepfake Detection and Localization

A Novel Unified Approach to Deepfake Detection

Phase4DFD: Multi-Domain Phase-Aware Attention for Deepfake Detection

Fairness-Aware Deepfake Detection: Leveraging Dual-Mechanism Optimization

ForensicFormer: Hierarchical Multi-Scale Reasoning for Cross-Domain Image Forgery Detection

ForensicFlow: A Tri-Modal Adaptive Network for Robust Deepfake Detection

Attack-Aware Deepfake Detection under Counter-Forensic Manipulations

StegaFFD: Privacy-Preserving Face Forgery Detection via Fine-Grained Steganographic Domain Lifting