defense 2025

A Dual-Branch CNN for Robust Detection of AI-Generated Facial Forgeries

Xin Zhang 1, Yuqi Song 1, Fei Zuo 2

0 citations · 30 references · International Conference on Cy...

Published on arXiv

2510.24640

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Proposed dual-branch CNN outperforms existing detectors on all four DiFF forgery categories and exceeds average human accuracy (which ranged from 45.53% to 59.65% depending on forgery type)

FSC Loss

Novel technique introduced


The rapid advancement of generative AI has enabled the creation of highly realistic forged facial images, posing significant threats to AI security, digital media integrity, and public trust. Face forgery techniques, ranging from face swapping and attribute editing to powerful diffusion-based image synthesis, are increasingly being used for malicious purposes such as misinformation, identity fraud, and defamation. This growing challenge underscores the urgent need for robust and generalizable face forgery detection methods as a critical component of AI security infrastructure. In this work, we propose a novel dual-branch convolutional neural network for face forgery detection that leverages complementary cues from both spatial and frequency domains. The RGB branch captures semantic information, while the frequency branch focuses on high-frequency artifacts that are difficult for generative models to suppress. A channel attention module is introduced to adaptively fuse these heterogeneous features, highlighting the most informative channels for forgery discrimination. To guide the network's learning process, we design a unified loss function, FSC Loss, that combines focal loss, supervised contrastive loss, and a frequency center margin loss to enhance class separability and robustness. We evaluate our model on the DiFF benchmark, which includes forged images generated from four representative methods: text-to-image, image-to-image, face swap, and face edit. Our method achieves strong performance across all categories and outperforms average human accuracy. These results demonstrate the model's effectiveness and its potential contribution to safeguarding AI ecosystems against visual forgery attacks.
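The channel attention fusion mentioned in the abstract can be illustrated with a minimal sketch. It assumes a squeeze-and-excitation style gate (global average pooling, one linear layer, then a sigmoid); the source does not describe the module's internals, so the function name `channel_attention_fuse`, the single-layer gate, and the `weights` and `bias` parameters are hypothetical and chosen only for illustration.

```python
import math


def global_avg_pool(channel):
    """Return the mean activation of one 2D feature map (a list of rows)."""
    total = sum(sum(row) for row in channel)
    count = sum(len(row) for row in channel)
    return total / count


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention_fuse(rgb_feats, freq_feats, weights, bias):
    """Fuse two branches by concatenating channels and gating each one.

    rgb_feats, freq_feats: lists of 2D feature maps (one per channel).
    weights: C x C matrix, bias: length-C vector, where C is the total
    channel count. Both stand in for learned parameters (assumption:
    a single linear layer, not the paper's actual design).
    """
    fused = rgb_feats + freq_feats                       # channel concatenation
    squeezed = [global_avg_pool(ch) for ch in fused]     # one scalar per channel
    gates = [
        sigmoid(sum(w * s for w, s in zip(row, squeezed)) + b)
        for row, b in zip(weights, bias)
    ]
    # Rescale every value in a channel by that channel's gate.
    out = [[[g * v for v in row] for row in ch] for ch, g in zip(fused, gates)]
    return out, gates


# Usage: one RGB channel and one frequency channel, each 2x2.
rgb = [[[1.0, 1.0], [1.0, 1.0]]]
freq = [[[2.0, 2.0], [2.0, 2.0]]]
zero_weights = [[0.0, 0.0], [0.0, 0.0]]
fused_maps, channel_gates = channel_attention_fuse(rgb, freq, zero_weights, [0.0, 0.0])
```

With all-zero weights and biases every gate is 0.5, so each channel is simply halved; in a trained model the gates would differ per channel and emphasize whichever branch carries more forgery evidence.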


Key Contributions

  • Dual-branch ResNet architecture combining RGB (spatial/semantic) and frequency-domain branches with channel attention fusion for complementary forgery cue extraction
  • Novel FSC Loss integrating focal loss, supervised contrastive loss, and frequency center margin loss to improve class separability and robustness
  • Strong performance on DiFF benchmark across all four forgery categories (T2I, I2I, face swap, face edit), surpassing average human accuracy
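The three terms of FSC Loss listed above can be sketched as a weighted sum. The focal and supervised contrastive terms follow their standard published definitions; the frequency center margin term is an assumed form (pull each frequency feature toward its class center and penalize class centers that sit closer than a margin), since the source does not give its formula. The function names, the equal default weights, and all hyperparameter values are hypothetical.

```python
import math


def focal_loss(p_fake, label, gamma=2.0, alpha=0.25):
    """Binary focal loss for one sample; label is 1 (fake) or 0 (real)."""
    p_t = p_fake if label == 1 else 1.0 - p_fake
    p_t = min(max(p_t, 1e-7), 1.0)                 # avoid log(0)
    alpha_t = alpha if label == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sup_con_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over L2-normalized embeddings."""
    z = [[x / math.sqrt(_dot(e, e)) for x in e] for e in embeddings]
    n, total, anchors = len(z), 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue                               # anchor has no same-class pair
        denom = sum(math.exp(_dot(z[i], z[a]) / temperature)
                    for a in range(n) if a != i)
        total += -sum(
            math.log(math.exp(_dot(z[i], z[p]) / temperature) / denom)
            for p in positives
        ) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def center_margin_loss(freq_feats, labels, centers, margin=1.0):
    """Assumed form, not the paper's formula: intra-class pull plus a
    hinge that pushes the two class centers at least `margin` apart."""
    pull = sum(_sq_dist(f, centers[y])
               for f, y in zip(freq_feats, labels)) / len(freq_feats)
    gap = math.sqrt(_sq_dist(centers[0], centers[1]))
    return pull + max(0.0, margin - gap) ** 2


def fsc_loss(probs, labels, embeddings, freq_feats, centers,
             weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three terms (the weights are hypothetical)."""
    focal = sum(focal_loss(p, y) for p, y in zip(probs, labels)) / len(probs)
    contrastive = sup_con_loss(embeddings, labels)
    center = center_margin_loss(freq_feats, labels, centers)
    return weights[0] * focal + weights[1] * contrastive + weights[2] * center
```

In this sketch the focal term down-weights easy samples, the contrastive term clusters same-class embeddings, and the center term acts only on frequency-branch features; how the paper balances the three is not stated in this summary.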

🛡️ Threat Analysis

Output Integrity Attack

Core contribution is a novel deepfake/face forgery detection architecture for verifying output integrity and authenticity of AI-generated facial images — canonical ML09 (AI-generated content detection).


Details

Domains
vision, generative
Model Types
cnn, diffusion, gan
Threat Tags
inference_time, digital
Datasets
DiFF
Applications
face forgery detection, deepfake detection