
Safeguarding Facial Identity against Diffusion-based Face Swapping via Cascading Pathway Disruption

Liqin Wang 1, Qianyue Hu 1, Wei Lu 1, Xiangyang Luo 2

0 citations · 34 references · arXiv


Published on arXiv

2601.14738

Input Manipulation Attack

OWASP ML Top 10 — ML01

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

VoidFace outperforms existing proactive defenses across multiple diffusion-based face swapping models while producing adversarial faces with superior visual quality

VoidFace

Novel technique introduced


The rapid evolution of diffusion models has democratized face swapping but also raises concerns about privacy and identity security. Existing proactive defenses, often adapted from image editing attacks, prove ineffective in this context. We attribute this failure to an oversight of the structural resilience and the unique static conditional guidance mechanism inherent in face swapping systems. To address this, we propose VoidFace, a systemic defense method that views face swapping as a coupled identity pathway. By injecting perturbations at critical bottlenecks, VoidFace induces cascading disruption throughout the pipeline. Specifically, we first introduce localization disruption and identity erasure to degrade physical regression and semantic embeddings, thereby impairing the accurate modeling of the source face. We then intervene in the generative domain by decoupling attention mechanisms to sever identity injection, and corrupting intermediate diffusion features to prevent the reconstruction of source identity. To ensure visual imperceptibility, we perform adversarial search in the latent manifold, guided by a perceptual adaptive strategy to balance attack potency with image quality. Extensive experiments show that VoidFace outperforms existing defenses across various diffusion-based swapping models, while producing adversarial faces with superior visual quality.
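The identity-erasure component described above can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `encode` is a toy stand-in for a frozen face-recognition encoder (the paper attacks several pipeline stages at once, with losses not specified in this summary), and the optimizer is a generic signed-gradient descent on cosine similarity under an L-infinity budget, using a central-difference gradient so the sketch needs no autodiff.

```python
import numpy as np

def identity_erasure_pgd(x, encode, eps=0.03, alpha=0.005, steps=30):
    """Toy identity-erasure attack: push the identity embedding of the
    perturbed face away from the clean embedding, keeping the
    perturbation within an L-infinity budget `eps`."""
    e_clean = encode(x)
    e_clean = e_clean / np.linalg.norm(e_clean)
    # Small random start: at delta = 0 the cosine objective sits at its
    # maximum and the gradient vanishes, so pure descent would stall.
    delta = np.random.default_rng(0).uniform(-eps / 10, eps / 10, size=x.shape)
    for _ in range(steps):
        g = np.zeros_like(x)
        h = 1e-4
        for i in range(x.size):  # central-difference gradient of cosine similarity
            d = np.zeros_like(x)
            d.flat[i] = h
            e_hi = encode(x + delta + d)
            e_lo = encode(x + delta - d)
            c_hi = e_hi @ e_clean / np.linalg.norm(e_hi)
            c_lo = e_lo @ e_clean / np.linalg.norm(e_lo)
            g.flat[i] = (c_hi - c_lo) / (2 * h)
        # Signed descent step on similarity, clipped back into the budget.
        delta = np.clip(delta - alpha * np.sign(g), -eps, eps)
    return delta
```

With a linear toy encoder, the returned perturbation stays inside the budget while measurably lowering the cosine similarity between clean and perturbed embeddings, which is the effect the paper relies on to impair source-face modeling.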


Key Contributions

  • VoidFace framework that models face swapping as a coupled identity pathway and injects perturbations at critical bottlenecks to induce cascading disruption across the entire pipeline
  • Localization disruption and identity erasure techniques to degrade physical regression and semantic embeddings, impairing accurate source face modeling
  • Latent manifold adversarial search with perceptual adaptive strategy to balance attack potency with visual imperceptibility of protected face images

🛡️ Threat Analysis

Input Manipulation Attack

The core technique crafts adversarial perturbations via latent manifold search that disrupt the face-swapping model's identity extraction and injection pipeline at inference time; this is adversarial input manipulation that causes the target model to fail at its intended function.
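The latent manifold search can be sketched as follows. All components are assumptions for illustration: `D` is a toy linear "decoder" defining the latent manifold, `F` a toy frozen "swap-model feature extractor", and the perceptual adaptive strategy is approximated crudely by projecting the decoded image back into a Euclidean ball of radius `tau` around the clean image after each ascent step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins, not the paper's actual models:
D = rng.normal(size=(16, 4))   # "decoder": latent code -> flattened image
F = rng.normal(size=(8, 16))   # frozen "swap-model feature extractor"

def latent_adversarial_search(x, tau=0.5, lr=0.05, steps=100):
    """Search for the adversarial face in the latent manifold: take
    normalized ascent steps on the feature gap ||F(x_adv) - F(x)||,
    then project the decoded image back within distance tau of x."""
    z, *_ = np.linalg.lstsq(D, x, rcond=None)   # encode x into latent space
    # Tiny jitter: at the clean point the feature gap (and its gradient) is zero.
    z = z + 1e-3 * np.random.default_rng(1).normal(size=z.shape)
    f_clean = F @ x
    for _ in range(steps):
        x_adv = D @ z
        g = D.T @ (F.T @ (F @ x_adv - f_clean))   # ascent direction on feature gap
        z = z + lr * g / (np.linalg.norm(g) + 1e-12)  # normalized gradient step
        # Perceptual projection: keep the decoded image close to x.
        d = D @ z - x
        n = np.linalg.norm(d)
        if n > tau:
            z, *_ = np.linalg.lstsq(D, x + d * (tau / n), rcond=None)
    return D @ z
```

Because the search moves through the decoder rather than pixel space, the resulting perturbation stays on the image manifold, which is one intuition behind latent-space attacks yielding better visual quality than pixel-space noise.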

Output Integrity Attack

The application goal is protecting facial identity against AI-generated content (diffusion-based face swapping, i.e. deepfakes); the adversarial perturbations act as anti-deepfake content protection, directly targeting the output integrity of AI-generated face content.


Details

Domains
vision, generative
Model Types
diffusion
Threat Tags
white_box, inference_time, digital
Applications
face swapping, deepfake prevention, facial identity protection