defense 2025

SFANet: Spatial-Frequency Attention Network for Deepfake Detection

Vrushank Ahire 1, Aniruddh Muley 1, Shivam Zample 1, Siddharth Verma 1, Pranav Menon 1, Surbhi Madan 1, Abhinav Dhall 2

0 citations · 28 references · arXiv


Published on arXiv

2510.04630

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Achieves state-of-the-art performance on DFWild-Cup, a diverse benchmark spanning eight deepfake datasets, through hybrid transformer and texture-based ensemble learning.

SFANet

Novel technique introduced


Detecting manipulated media has become a pressing issue with the recent rise of deepfakes. Most existing approaches fail to generalize across diverse datasets and generation techniques. We therefore propose a novel ensemble framework that combines transformer-based architectures, such as Swin Transformers and ViTs, with texture-based methods to improve detection accuracy and robustness. Our method introduces data-splitting, sequential training, frequency splitting, patch-based attention, and face segmentation techniques to handle dataset imbalances, emphasize high-impact regions (e.g., eyes and mouth), and improve generalization. Our model achieves state-of-the-art performance when tested on the DFWild-Cup dataset, a diverse subset of eight deepfake datasets. The ensemble benefits from the complementarity of these approaches, with transformers excelling in global feature extraction and texture-based methods providing interpretability. This work demonstrates that hybrid models can effectively address the evolving challenges of deepfake detection, offering a robust solution for real-world applications.
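
The abstract does not state how the individual detectors are fused, so the following is only a minimal sketch of one common option, weighted soft voting over per-model fake probabilities. The function names `soft_vote` and `predict_labels`, the weights, and the 0.5 threshold are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def soft_vote(prob_fake_by_model, weights=None):
    """Fuse per-model fake probabilities by weighted averaging.

    prob_fake_by_model: array-like of shape (n_models, n_samples).
    weights: optional per-model weights (hypothetical; uniform if omitted).
    """
    probs = np.asarray(prob_fake_by_model, dtype=float)
    if weights is None:
        weights = np.ones(probs.shape[0])
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so weights sum to 1
    return weights @ probs             # shape: (n_samples,)

def predict_labels(prob_fake_by_model, weights=None, threshold=0.5):
    """Return 1 (fake) where the fused probability reaches the threshold."""
    fused = soft_vote(prob_fake_by_model, weights)
    return (fused >= threshold).astype(int)

# Illustrative scores from three hypothetical detectors on two images.
scores = [
    [0.9, 0.2],  # e.g. a Swin-style model
    [0.7, 0.4],  # e.g. a ViT-style model
    [0.8, 0.1],  # e.g. a texture-based model
]
labels = predict_labels(scores)
```

Averaging probabilities rather than hard votes lets a confident model outweigh two uncertain ones, which is one plausible way to exploit the complementarity the abstract describes.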


Key Contributions

  • Ensemble framework combining Swin Transformers, ViTs, and texture-based (LBP/FFT) methods for complementary global and local feature extraction
  • Novel pipeline techniques including frequency splitting, patch-based attention on high-impact facial regions (eyes, mouth), and face segmentation to improve generalization
  • Sequential training and data-splitting strategies to handle dataset imbalance across diverse deepfake generation methods
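
The frequency splitting mentioned above can be illustrated with a short sketch. It assumes a grayscale image and a simple radial cutoff in the 2D FFT domain. The source does not specify the paper's actual filter, so `split_frequencies` and the `cutoff` value are hypothetical choices.

```python
import numpy as np

def split_frequencies(image, cutoff=0.1):
    """Split a 2D grayscale image into low- and high-frequency components.

    cutoff is a radius in cycles per pixel (assumed value, not from the paper).
    The two components sum back to the original image.
    """
    image = np.asarray(image, dtype=float)
    h, w = image.shape

    # Centered spectrum and matching centered frequency grids.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    # Complementary masks: low band inside the cutoff radius, high band outside.
    low_mask = radius <= cutoff
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high

# Usage on a random stand-in for a face crop.
rng = np.random.default_rng(0)
face = rng.random((64, 64))
low, high = split_frequencies(face)
```

Because the masks are complementary, `low + high` reconstructs the input, so each band could be passed to a separate branch without losing information.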

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel AI-generated content detection architecture targeting deepfake images/videos — directly addresses output integrity by distinguishing authentic from AI-manipulated media across diverse generation techniques.


Details

Domains
vision
Model Types
transformer, cnn
Threat Tags
inference_time
Datasets
DFWild-Cup
Applications
deepfake detection, manipulated media detection