FakeChain: Exposing Shallow Cues in Multi-Step Deepfake Detection
Published on arXiv
2509.16602
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Deepfake detectors lose up to 58.83% F1-score when the final manipulation step differs from the training distribution, demonstrating reliance on shallow last-stage cues rather than cumulative manipulation history.
FakeChain
Novel technique introduced
Multi-step or hybrid deepfakes, created by sequentially applying different deepfake creation methods such as Face-Swapping, GAN-based generation, and Diffusion methods, pose an emerging and unforeseen technical challenge for detection models trained on single-step forgeries. While prior studies have mainly focused on detecting isolated, single-step manipulations, little is known about detector behavior under such compositional, hybrid manipulation pipelines. In this work, we introduce \textbf{FakeChain}, a large-scale benchmark comprising 1-, 2-, and 3-step forgeries synthesized using five representative state-of-the-art generators. Using this benchmark, we analyze detection performance and spectral properties across hybrid manipulations at different step counts, generator combinations, and quality settings. Surprisingly, our findings reveal that detection performance depends heavily on the final manipulation type, with F1-score dropping by up to \textbf{58.83\%} when it differs from the training distribution. This demonstrates that detectors rely on last-stage artifacts rather than cumulative manipulation traces, limiting generalization, and highlights the need for detection models to explicitly consider manipulation history and sequence. Our results underscore the importance of benchmarks such as FakeChain, which reflect the growing complexity and diversity of synthesis in real-world scenarios. Our sample code is available here\footnote{https://github.com/minjihh/FakeChain}.
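The core idea of chaining manipulation steps can be illustrated with a minimal sketch. The generator functions below (`face_swap`, `gan_regenerate`, `diffusion_edit`) are hypothetical placeholders standing in for the paper's five real generators, and the trivial array transforms are illustrative only; the point is the sequential composition and the enumeration of ordered 1-, 2-, and 3-step chains.

```python
from itertools import product
import numpy as np

# Hypothetical stand-ins for the benchmark's generators; each maps an
# image array to a "manipulated" image array. The actual FakeChain
# generators are full face-swap / GAN / diffusion models.
def face_swap(img):
    return np.flip(img, axis=1)                 # placeholder transform

def gan_regenerate(img):
    return np.clip(img * 0.98 + 2.0, 0, 255)    # placeholder transform

def diffusion_edit(img):
    return np.roll(img, 1, axis=0)              # placeholder transform

GENERATORS = {"FS": face_swap, "GAN": gan_regenerate, "DIFF": diffusion_edit}

def apply_chain(img, chain):
    """Sequentially apply each named manipulation step to the image."""
    for name in chain:
        img = GENERATORS[name](img)
    return img

def enumerate_chains(steps):
    """All ordered generator combinations of a given length (with repeats)."""
    return list(product(GENERATORS, repeat=steps))
```

With three placeholder generators this yields 3, 9, and 27 distinct ordered chains at 1, 2, and 3 steps; only the final step's generator determines the "last-stage" artifacts that, per the paper's finding, dominate detector decisions.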
Key Contributions
- FakeChain large-scale benchmark with 1-, 2-, and 3-step hybrid deepfakes synthesized using five state-of-the-art generators across varying quality settings
- Empirical finding that deepfake detectors exhibit a 'last-stage bias' — relying on final manipulation artifacts rather than cumulative traces, causing F1 drops of up to 58.83% under distribution shift
- Spectral analysis revealing differential compression robustness: attention-based models (MAT) are more sensitive to JPEG degradation than CNN-based models (Xception)
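A common way to carry out the kind of spectral analysis mentioned above is an azimuthally averaged power spectrum, which collapses the 2-D Fourier magnitude of an image into a 1-D radial profile where generator- and compression-specific frequency artifacts show up. This is a generic sketch of that standard technique, not the paper's exact analysis pipeline; the bin count is an arbitrary choice.

```python
import numpy as np

def radial_power_spectrum(img, n_bins=32):
    """Azimuthally averaged log power spectrum of a 2-D grayscale image.

    Returns a 1-D profile of length n_bins, from low (center) to high
    (edge) spatial frequencies.
    """
    # Centered 2-D FFT and log power.
    f = np.fft.fftshift(np.fft.fft2(img))
    power = np.log1p(np.abs(f) ** 2)

    # Radial distance of every frequency bin from the spectrum center.
    h, w = img.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h / 2.0, xx - w / 2.0)

    # Average power within equal-width radial bins.
    bins = np.linspace(0.0, r.max(), n_bins + 1)
    idx = np.clip(np.digitize(r.ravel(), bins) - 1, 0, n_bins - 1)
    sums = np.bincount(idx, weights=power.ravel(), minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    return sums / np.maximum(counts, 1)
```

Comparing such profiles before and after JPEG recompression makes the differential robustness visible: high-frequency bins are suppressed by compression, which would hit models that key on those bands harder.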
🛡️ Threat Analysis
Directly addresses AI-generated content detection (deepfake detection) — proposes FakeChain benchmark to evaluate and expose failure modes of detectors on multi-step/hybrid deepfakes combining face-swapping, GAN, and diffusion methods.