Defense · 2026

Beauty and the Beast: Imperceptible Perturbations Against Diffusion-Based Face Swapping via Directional Attribute Editing

Yilong Huang, Songze Li

0 citations · 61 references · arXiv


Published on arXiv · 2601.22744

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

FaceDefense achieves a superior imperceptibility–defense effectiveness trade-off over existing proactive defense methods (including MyFace) across all tested perturbation budgets against diffusion-based face swapping

FaceDefense

Novel technique introduced


Diffusion-based face swapping achieves state-of-the-art performance, but it also exacerbates the potential harm of malicious face swapping, violating portrait rights and undermining personal reputation. This has spurred the development of proactive defense methods. Existing approaches, however, face a core trade-off: large perturbations distort facial structures, while small ones weaken protection. To address this, we propose FaceDefense, an enhanced proactive defense framework against diffusion-based face swapping. Our method introduces a new diffusion loss to strengthen the defensive efficacy of adversarial examples, and employs directional facial attribute editing to restore perturbation-induced distortions, thereby enhancing visual imperceptibility. A two-phase alternating optimization strategy generates the final perturbed face images. Extensive experiments show that FaceDefense significantly outperforms existing methods in both imperceptibility and defense effectiveness, achieving a superior trade-off.


Key Contributions

  • Identifies that latent-space perturbations in LDMs preferentially distort high-level facial semantics (eyes, nose) while leaving compressed regions (hair, background) intact, explaining visible facial degradation in prior methods
  • Proposes FaceDefense, which employs directional facial attribute editing in the W+ space to restore perturbation-induced facial distortions, improving imperceptibility without sacrificing defense strength
  • Introduces a two-phase alternating optimization strategy that jointly minimizes a novel diffusion loss and an attribute-editing restoration objective to achieve a superior imperceptibility–effectiveness trade-off
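The two-phase alternating scheme can be illustrated with a toy NumPy sketch: phase 1 takes a projected sign-gradient step that *ascends* a surrogate "diffusion loss" (standing in for differentiating through the diffusion model), and phase 2 takes a restoration step that nudges the perturbation toward a fixed "attribute edit direction" (standing in for W+-space editing), with both phases projected onto an L∞ budget. All losses, directions, and function names here are hypothetical illustrations, not the paper's actual objectives.

```python
import numpy as np

def project_linf(delta, eps):
    # Keep the perturbation within the L-infinity budget eps.
    return np.clip(delta, -eps, eps)

def two_phase_defense(x, eps=8 / 255, steps=20, alpha=2 / 255, rng=None):
    """Toy sketch of a two-phase alternating optimization.

    Phase 1 ascends a surrogate 'diffusion loss' to disrupt the swapping
    model; phase 2 pulls the perturbation toward a restoration direction
    to reduce visible distortion. Stand-in objectives only.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    delta = np.zeros_like(x)
    # Hypothetical fixed 'attribute edit direction' (stand-in for W+ editing).
    restore_dir = rng.standard_normal(x.shape)
    restore_dir /= np.linalg.norm(restore_dir)
    for _ in range(steps):
        # Phase 1: sign-gradient ascent on a toy loss ||x + delta||^2
        # (the real method differentiates through the diffusion model).
        grad = 2.0 * (x + delta)
        delta = project_linf(delta + alpha * np.sign(grad), eps)
        # Phase 2: restoration step toward the edit direction.
        delta = project_linf(delta - 0.5 * alpha * np.sign(delta - restore_dir), eps)
    # Return the protected image, clipped to the valid pixel range.
    return np.clip(x + delta, 0.0, 1.0)
```

The projection after each phase is what enforces the imperceptibility budget; alternating the two steps lets the restoration objective undo distortion without simply canceling the defensive gradient.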

🛡️ Threat Analysis

Output Integrity Attack

Defends against malicious AI-generated content (deepfake face swapping) by crafting protective perturbations that corrupt diffusion model outputs — a proactive anti-deepfake content integrity defense. Per the defense tagging rule, the category reflects the threat being defended against: unauthorized AI-generated synthetic faces.


Details

Domains
vision, generative
Model Types
diffusion, gan
Threat Tags
white_box, inference_time, digital
Applications
face swapping protection, deepfake prevention, portraiture right protection