A Novel Local Focusing Mechanism for Deepfake Detection Generalization

The rapid advancement of deepfake generation techniques has intensified the need for robust and generalizable detection methods. Existing approaches based on reconstruction learning typically use deep convolutional networks to extract differential features. However, these methods generalize poorly across object categories (e.g., from faces to cars) and generation domains (e.g., from GANs to Stable Diffusion) because of two intrinsic limitations of deep CNNs. First, models trained on a specific category tend to overfit to its semantic feature distribution, which makes them less transferable to other categories, especially as network depth increases. Second, Global Average Pooling (GAP) compresses critical local forgery cues into a single vector and so discards discriminative patterns vital for real-fake classification. To address these issues, we propose a novel Local Focus Mechanism (LFM) that explicitly attends to discriminative local features for differentiating fake from real images. LFM integrates a Salience Network (SNet) with a task-specific Top-K Pooling (TKP) module to select the K most informative local patterns. To mitigate the overfitting that Top-K pooling can introduce, we add two regularization techniques, Rank-Based Linear Dropout (RBLD) and Random-K Sampling (RKS), which improve the model's robustness. LFM achieves a 3.7% improvement in accuracy and a 2.8% increase in average precision over the state-of-the-art Neighboring Pixel Relationships (NPR) method, while running at 1789 FPS on a single NVIDIA A6000 GPU. Our approach sets a new benchmark for cross-domain deepfake detection. The source code is available at https://github.com/lmlpy/LFM.git
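To make the contrast between GAP and Top-K pooling concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy feature map, and the choice to average the K selected features are all assumptions made for illustration. The abstract only states that TKP selects the K most informative local patterns using salience scores.

```python
from typing import List, Sequence


def global_average_pool(features: Sequence[Sequence[float]]) -> List[float]:
    """Average every local feature vector into a single vector (GAP)."""
    n = len(features)
    dim = len(features[0])
    return [sum(f[d] for f in features) / n for d in range(dim)]


def top_k_pool(
    features: Sequence[Sequence[float]],
    salience: Sequence[float],
    k: int,
) -> List[float]:
    """Average only the k locations with the highest salience score.

    Assumption: the selected features are averaged. The source does not
    say how the selected patterns are aggregated.
    """
    if not 1 <= k <= len(features):
        raise ValueError("k must be between 1 and the number of locations")
    # Rank location indices from most to least salient.
    ranked = sorted(range(len(features)), key=lambda i: salience[i], reverse=True)
    chosen = ranked[:k]
    dim = len(features[0])
    return [sum(features[i][d] for i in chosen) / k for d in range(dim)]


# Toy feature map: 4 spatial locations, 2 channels each.
# Only location 2 carries a strong (hypothetical) forgery cue.
features = [[0.0, 0.0], [0.0, 0.0], [8.0, 4.0], [0.0, 0.0]]
# Hypothetical salience scores, standing in for the output of a salience network.
salience = [0.1, 0.2, 0.9, 0.05]

gap = global_average_pool(features)        # cue diluted by the empty locations
tkp = top_k_pool(features, salience, k=1)  # cue preserved at full strength
```

In this toy case GAP shrinks the cue to a quarter of its magnitude because three of the four locations are empty, while Top-K pooling with K = 1 returns the cue unchanged. This is the dilution effect the abstract attributes to GAP.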

Key Contributions

Local Focus Mechanism (LFM) combining a Salience Network (SNet) with Top-K Pooling (TKP) to capture discriminative local forgery cues discarded by Global Average Pooling
Two regularization techniques — Rank-Based Linear Dropout (RBLD) and Random-K Sampling (RKS) — to prevent overfitting introduced by Top-K selection
Cross-domain generalization across object categories (faces, cars) and generation methods (GANs, Stable Diffusion) with 3.7% accuracy and 2.8% AP gains over SOTA NPR at 1789 FPS
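The two regularizers in the list above are only named on this page, not specified. The sketch below shows one plausible reading of each, and every detail in it is an assumption rather than the paper's definition: Random-K Sampling is taken to mean drawing K at random for each training step, and Rank-Based Linear Dropout is taken to mean dropping ranked locations with a probability that grows linearly with rank. The function names and the `max_drop` parameter are hypothetical.

```python
import random
from typing import List


def random_k(k_max: int, rng: random.Random) -> int:
    """Assumed RKS: draw K uniformly from 1..k_max for one training step."""
    return rng.randint(1, k_max)


def rank_based_linear_dropout(
    ranked_indices: List[int],
    max_drop: float,
    rng: random.Random,
) -> List[int]:
    """Assumed RBLD: drop probability rises linearly with rank.

    The top-ranked index has drop probability 0 and the last-ranked index
    has drop probability max_drop, so the result is never empty.
    """
    n = len(ranked_indices)
    kept = []
    for rank, idx in enumerate(ranked_indices):
        p_drop = max_drop * rank / (n - 1) if n > 1 else 0.0
        if rng.random() >= p_drop:
            kept.append(idx)
    return kept


rng = random.Random(0)
ranked = [2, 1, 0, 3]  # hypothetical location indices, most salient first
k = random_k(len(ranked), rng)
kept = rank_based_linear_dropout(ranked[:k], max_drop=0.5, rng=rng)
```

Under these assumptions both techniques perturb which locations reach the classifier during training, so the model cannot rely on one fixed set of top-ranked locations. At inference time a fixed K without dropout would presumably be used, but the page does not confirm this.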

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel deepfake detection architecture (LFM with SNet + TKP) for identifying AI-generated images across generation domains (GANs, Stable Diffusion) and object categories — directly addresses AI-generated content detection, a core ML09 concern.

Details

Domains

vision

Model Types

cnn

Threat Tags

inference_time

digital

Applications

2025 0 cit.

Output Integrity Attack

90%

Similar Papers

RCDN: Real-Centered Detection Network for Robust Face Forgery Identification

ForensicFlow: A Tri-Modal Adaptive Network for Robust Deepfake Detection

A Novel Unified Approach to Deepfake Detection

TwoHead-SwinFPN: A Unified DL Architecture for Synthetic Manipulation, Detection and Localization in Identity Documents

Attack-Aware Deepfake Detection under Counter-Forensic Manipulations

ForensicFormer: Hierarchical Multi-Scale Reasoning for Cross-Domain Image Forgery Detection

Fairness-Aware Deepfake Detection: Leveraging Dual-Mechanism Optimization

Towards Robust DeepFake Detection under Unstable Face Sequences: Adaptive Sparse Graph Embedding with Order-Free Representation and Explicit Laplacian Spectral Prior