Beauty and the Beast: Imperceptible Perturbations Against Diffusion-Based Face Swapping via Directional Attribute Editing

Diffusion-based face swapping achieves state-of-the-art performance, but it also amplifies the potential harm of malicious face swapping, which can violate portrait rights or damage personal reputation. This has spurred the development of proactive defense methods. However, existing approaches face a core trade-off: large perturbations distort facial structures, while small ones weaken protection. To address this trade-off, we propose FaceDefense, an enhanced proactive defense framework against diffusion-based face swapping. Our method introduces a new diffusion loss to strengthen the defensive efficacy of adversarial examples, and employs directional facial attribute editing to repair perturbation-induced distortions, thereby improving visual imperceptibility. A two-phase alternating optimization strategy generates the final perturbed face images. Extensive experiments show that FaceDefense significantly outperforms existing methods in both imperceptibility and defense effectiveness, achieving a superior trade-off.
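The two-phase alternating structure described in the abstract can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the paper's implementation: the names `two_phase_perturb`, `defense_grad`, `restore_grad`, `face_mask`, and `target`, the L-infinity budget `eps`, and the signed-gradient update are all hypothetical, and both objectives are simple quadratic stand-ins for the diffusion loss and the attribute-editing restoration objective, which the summary does not specify.

```python
import numpy as np

def signed_descent(delta, grad, step, eps):
    """One signed-gradient descent step, projected back into the L-inf ball."""
    return np.clip(delta - step * np.sign(grad), -eps, eps)

def two_phase_perturb(x, defense_grad, restore_grad,
                      eps=8 / 255, step=1 / 255, rounds=5, inner=4):
    """Alternate between two objectives while keeping |delta| <= eps."""
    delta = np.zeros_like(x)
    for _ in range(rounds):
        # Phase 1: descend the (stand-in) defense loss.
        for _ in range(inner):
            delta = signed_descent(delta, defense_grad(x + delta), step, eps)
        # Phase 2: descend the (stand-in) restoration loss.
        for _ in range(inner):
            delta = signed_descent(delta, restore_grad(x + delta), step, eps)
    return np.clip(x + delta, 0.0, 1.0)  # keep a valid image in [0, 1]

# Toy data: an 8x8 "image", a reference the defense pushes away from,
# and a hypothetical mask marking the region where distortion is most visible.
rng = np.random.default_rng(0)
x = rng.uniform(size=(8, 8))
target = rng.uniform(size=(8, 8))
face_mask = np.zeros((8, 8))
face_mask[2:6, 2:6] = 1.0
eps = 8 / 255

# Stand-in defense loss: -0.5 * ||z - target||^2 (minimizing it moves z away).
defense_grad = lambda z: -(z - target)
# Stand-in restoration loss: 0.5 * ||face_mask * (z - x)||^2.
restore_grad = lambda z: face_mask * (z - x)

x_adv = two_phase_perturb(x, defense_grad, restore_grad, eps=eps)
```

In this toy setting the restoration phase only acts inside the mask, so the perturbation there stays smaller than in the unmasked region while the overall budget is still respected. In a real pipeline the two gradient callables would come from backpropagation through the respective models.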

Key Contributions

Identifies that latent-space perturbations in LDMs preferentially distort high-level facial semantics (eyes, nose) while leaving compressed regions (hair, background) intact, explaining visible facial degradation in prior methods
Proposes FaceDefense, which employs directional facial attribute editing in the W+ space to restore perturbation-induced facial distortions, improving imperceptibility without sacrificing defense strength
Introduces a two-phase alternating optimization strategy that alternates between minimizing a novel diffusion loss and an attribute-editing restoration objective to achieve a superior imperceptibility–effectiveness trade-off
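Directional editing in a W+ latent space, as named in the contributions above, is commonly written as a shift of the latent code along an attribute direction. The sketch below shows only that generic operation and is not FaceDefense's editing procedure: the function `edit_w_plus`, the random `direction`, the strength `alpha`, and the choice of layers are hypothetical, and the 18 x 512 shape is assumed from the usual W+ layout of a 1024-pixel StyleGAN generator.

```python
import numpy as np

def edit_w_plus(w_plus, direction, alpha, layers=None):
    """Shift selected layers of a W+ code along a unit attribute direction."""
    unit = direction / np.linalg.norm(direction)
    edited = w_plus.copy()
    selected = range(w_plus.shape[0]) if layers is None else layers
    for i in selected:
        edited[i] += alpha * unit  # same direction, applied per layer
    return edited

rng = np.random.default_rng(0)
w = rng.normal(size=(18, 512))      # assumed W+ code: one 512-d vector per layer
direction = rng.normal(size=512)    # hypothetical attribute direction
w_edit = edit_w_plus(w, direction, alpha=2.0, layers=[4, 5, 6, 7])
```

Restricting the shift to a subset of layers is one common way to limit an edit to a particular level of detail; which layers and directions suit a given attribute depends on the generator and is not stated in the summary.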

🛡️ Threat Analysis

Output Integrity Attack

Defends against malicious AI-generated content (deepfake face swapping) by crafting protective perturbations that corrupt diffusion model outputs — a proactive anti-deepfake content integrity defense. Per the defense tagging rule, the category reflects the threat being defended against: unauthorized AI-generated synthetic faces.

Details

Domains

vision, generative

Model Types

diffusion, gan

Threat Tags

white_box, inference_time, digital

Applications

2025 0 cit.
