defense 2025

ForensicFlow: A Tri-Modal Adaptive Network for Robust Deepfake Detection

Mohammad Romani

0 citations · 9 references · arXiv


Published on arXiv

2511.14554

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

ForensicFlow achieves AUC 0.9752 and F1 0.9408 on Celeb-DF(v2), outperforming single-stream detectors through multi-domain branch fusion.

ForensicFlow

Novel technique introduced


Modern deepfakes evade detection by leaving subtle, domain-specific artifacts that single-branch networks miss. ForensicFlow addresses this by fusing evidence across three forensic dimensions: global visual inconsistencies (via ConvNeXt-tiny), fine-grained texture anomalies (via Swin Transformer-tiny), and spectral noise patterns (via CNN with channel attention). Our attention-based temporal pooling dynamically prioritizes high-evidence frames, while adaptive fusion weights each branch according to forgery type. Trained on Celeb-DF(v2) with Focal Loss, the model achieves AUC 0.9752, F1 0.9408, and accuracy 0.9208, outperforming single-stream detectors. Ablation studies confirm branch synergy, and Grad-CAM visualizations validate focus on genuine manipulation regions (e.g., facial boundaries). This multi-domain fusion strategy establishes robustness against increasingly sophisticated forgeries.
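The abstract names Focal Loss as the training objective but gives no hyperparameters. As a rough illustration, the standard binary form is FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), sketched below in plain Python. The function name `binary_focal_loss` is hypothetical, and alpha = 0.25 and gamma = 2.0 are commonly used defaults assumed here, not values reported for ForensicFlow.

```python
import math


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal loss for a single prediction.

    p: predicted probability that the clip is fake (class 1).
    y: ground-truth label, 0 (real) or 1 (fake).
    alpha, gamma: ASSUMED defaults, not taken from the paper.
    """
    # Clamp to avoid log(0).
    p = min(max(p, eps), 1.0 - eps)
    # p_t is the probability assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    # alpha_t balances the two classes.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # A confident correct prediction contributes far less than a hard one.
    print(binary_focal_loss(0.95, 1), binary_focal_loss(0.55, 1))
```

With gamma set to 0 the modulating factor disappears and the expression reduces to class-weighted cross-entropy, which is a quick way to sanity-check an implementation.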


Key Contributions

  • Tri-modal forensic architecture combining ConvNeXt-tiny (global visual inconsistencies), Swin Transformer-tiny (fine-grained texture anomalies), and a frequency-domain CNN with channel attention (spectral noise patterns)
  • Attention-based temporal pooling that dynamically prioritizes high-evidence frames and adaptive branch fusion weighted by forgery type
  • Achieves AUC 0.9752, F1 0.9408, and accuracy 0.9208 on Celeb-DF(v2), with Grad-CAM confirming focus on genuine manipulation regions
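
The attention-based temporal pooling and adaptive branch fusion listed above can both be read as softmax-weighted averages. The plain-Python sketch below shows that reading only. The function names are hypothetical, and the per-frame scores and per-branch gate logits are passed in as inputs, whereas in a trained model they would presumably come from learned layers whose form is not described here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(vectors, weights):
    """Element-wise weighted sum of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def attention_pool(frame_features, frame_scores):
    """Pool per-frame feature vectors into one clip-level vector.

    frame_scores are ASSUMED to be produced by a learned scoring layer;
    higher-scoring (higher-evidence) frames get more weight.
    """
    weights = softmax(frame_scores)
    return weighted_sum(frame_features, weights), weights


def adaptive_fusion(branch_features, gate_logits):
    """Fuse same-sized branch vectors (e.g. global, texture, frequency).

    gate_logits are ASSUMED to be predicted per input, so the mix of
    branches can change from one sample to the next.
    """
    gates = softmax(gate_logits)
    return weighted_sum(branch_features, gates), gates


if __name__ == "__main__":
    frames = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1]]
    clip_vec, w = attention_pool(frames, [0.0, 3.0, 0.0])
    print(w, clip_vec)  # the middle frame dominates the pooled vector
```

Equal scores reduce both operations to a plain mean, so the learned scores are what let the model depart from uniform averaging.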

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel forensic detection architecture for AI-generated/manipulated video (deepfakes), directly addressing output integrity and authenticity of synthetic media — the canonical ML09 use case.


Details

Domains
vision
Model Types
cnn, transformer
Threat Tags
inference_time, digital
Datasets
Celeb-DF(v2)
Applications
deepfake detection, video forensics, synthetic face detection