Attack-Aware Deepfake Detection under Counter-Forensic Manipulations
Noor Fatima 1, Hasan Faraz Khan 1, Muzammil Behzad 1,2
1 King Fahd University of Petroleum and Minerals
2 SDAIA-KFUPM Joint Research Center for Artificial Intelligence
Published on arXiv
2512.22303
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
The detector achieves near-perfect ranking across counter-forensic attack families with consistently low calibration error and minimal abstention risk; regrain emerges as the hardest stressor but remains controlled by the combined training-and-defense regimen.
This work presents an attack-aware deepfake and image-forensics detector designed for robustness, well-calibrated probabilities, and transparent evidence under realistic deployment conditions. The method combines red-team training with a randomized test-time defense in a two-stream architecture: one stream encodes semantic content using a pretrained backbone, the other extracts forensic residuals, and the two are fused via a lightweight residual adapter for classification, while a shallow Feature Pyramid Network (FPN)-style head produces tamper heatmaps under weak supervision. Red-team training applies worst-of-K counter-forensics per batch, including JPEG realign-and-recompress, resampling warps, denoise-to-regrain operations, seam smoothing, small color and gamma shifts, and social-app transcodes. The test-time defense injects low-cost jitters, such as resize and crop phase changes, mild gamma variation, and JPEG phase shifts, and aggregates the resulting predictions. Heatmaps are guided to concentrate within face regions using face-box masks, without requiring strict pixel-level annotations. Evaluation on existing benchmarks, including standard deepfake datasets and a surveillance-style split with low light and heavy compression, reports clean and attacked performance, including AUC, worst-case accuracy, reliability, abstention quality, and weak-localization scores. Results demonstrate near-perfect ranking across attacks, low calibration error, minimal abstention risk, and controlled degradation under regrain, establishing a modular, data-efficient, and practically deployable baseline for attack-aware detection with calibrated probabilities and actionable heatmaps.
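The worst-of-K red-team scheme described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the transform functions (`jpeg_like_quantize`, `gamma_shift`, `resample_warp`) are simplified stand-ins we introduce for the real counter-forensic operations, and `loss_fn` stands for the detector's per-sample training loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simplified stand-ins for the paper's counter-forensic families (names are ours).
def jpeg_like_quantize(img, step=8.0):
    """Coarse value quantization as a crude proxy for JPEG recompression."""
    return np.round(img / step) * step

def gamma_shift(img, gamma=1.1):
    """Small gamma perturbation (a mild tone-curve shift)."""
    return 255.0 * (img / 255.0) ** gamma

def resample_warp(img, shift=1):
    """Tiny horizontal shift as a stand-in for resampling warps."""
    return np.roll(img, shift, axis=1)

ATTACKS = [jpeg_like_quantize, gamma_shift, resample_warp]

def worst_of_k(img, loss_fn, k=3):
    """Apply k randomly sampled counter-forensic ops and return the variant
    with the highest detector loss -- the 'worst case' used for training."""
    candidates = [ATTACKS[rng.integers(len(ATTACKS))](img) for _ in range(k)]
    losses = [loss_fn(c) for c in candidates]
    return candidates[int(np.argmax(losses))]
```

In training, the selected worst-case variant replaces (or accompanies) the clean sample in the batch, so the detector is always optimized against the most damaging manipulation found among the K candidates.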
Key Contributions
- Two-stream architecture fusing semantic content (pretrained backbone) and forensic residuals via a lightweight residual adapter, with an FPN-style head producing weakly supervised tamper heatmaps guided by face-box masks
- Red-team training with worst-of-K counter-forensic augmentation (JPEG realign/recompress, resampling warps, denoise-to-regrain, seam smoothing, color/gamma shifts, social-app transcodes) to harden the detector against realistic field manipulations
- Randomized test-time defense with low-cost jitters (resize/crop phase, mild gamma, JPEG phase shifts) and prediction aggregation, paired with calibration and abstention diagnostics for reliable deployment
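The randomized test-time defense in the last bullet can be sketched as below. The jitter set and the `detector` callable are illustrative assumptions on our part; the key idea from the paper is simply that several cheap, randomized views of the input are scored and the predictions aggregated, which averages out manipulations tuned to one exact pixel grid or compression phase.

```python
import numpy as np

rng = np.random.default_rng(0)

def jitter(img):
    """One cheap test-time jitter: a random crop-phase shift plus a mild
    gamma change (an illustrative subset, not the paper's exact jitter set)."""
    dx = int(rng.integers(0, 2))                 # random phase shift of 0 or 1 px
    gamma = float(rng.uniform(0.95, 1.05))       # mild gamma variation
    shifted = np.roll(img, dx, axis=1)
    return 255.0 * (shifted / 255.0) ** gamma

def defended_score(img, detector, n=8):
    """Average the detector's fake-probability over n jittered views."""
    probs = [detector(jitter(img)) for _ in range(n)]
    return float(np.mean(probs))
```

Averaging probabilities over views is one simple aggregation choice; a median or a trimmed mean would make the defense more robust to a single badly perturbed view at slightly higher variance on clean inputs.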
🛡️ Threat Analysis
The primary contribution is a novel deepfake detection system (AI-generated content detection) explicitly designed to withstand counter-forensic manipulations (JPEG recompression, resampling, denoise-to-regrain, social-app transcodes) that would otherwise degrade or fool the detector, along with weakly supervised tamper heatmaps for output interpretability. This places the work squarely within output integrity and AI-generated content detection.