defense 2025

Disruptive Attacks on Face Swapping via Low-Frequency Perceptual Perturbations

Mengxiao Huang , Minglei Shu , Shuwang Zhou , Zhaoyang Liu


Published on arXiv (2508.20595)

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Frequency-domain perturbations significantly reduce face-swapping output quality and naturalness on CelebA-HQ and LFW while keeping the protected source image visually plausible.

Low-Frequency Perceptual Perturbation (LFP)

Novel technique introduced


Deepfake technology, driven by Generative Adversarial Networks (GANs), poses significant risks to privacy and societal security. Existing detection methods are predominantly passive, focusing on post-event analysis without preventing attacks. To address this, we propose an active defense method based on low-frequency perceptual perturbations that disrupts face-swapping manipulation, reducing the performance and naturalness of the generated content. Unlike prior approaches that used low-frequency perturbations to degrade classification accuracy, our method directly targets the generative process of deepfake techniques. We combine frequency- and spatial-domain features to strengthen the defense. By introducing artifacts through low-frequency perturbations while preserving high-frequency details, we ensure the protected image remains visually plausible. Additionally, we design a complete architecture featuring an encoder, a perturbation generator, and a decoder, leveraging the discrete wavelet transform (DWT) to extract low-frequency components and generate perturbations that disrupt facial-manipulation models. Experiments on CelebA-HQ and LFW demonstrate significant reductions in face-swapping effectiveness, improved defense success rates, and preserved visual quality.
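The core mechanism — decomposing an image with a DWT, perturbing only the low-frequency subband, and reconstructing so that high-frequency detail is untouched — can be illustrated with a minimal numpy-only sketch. This is not the paper's trained perturbation generator: it uses a single-level Haar DWT and a random sign perturbation with a hypothetical budget `eps` purely to show where a low-frequency perturbation lives and what it leaves intact.

```python
import numpy as np

def haar_dwt2(img):
    """One level of the 2-D Haar DWT: returns (LL, LH, HL, HH) subbands."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # vertical average
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # vertical difference
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0      # low-frequency approximation
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0      # horizontal detail
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0      # vertical detail
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0      # diagonal detail
    return LL, LH, HL, HH

def haar_idwt2(LL, LH, HL, HH):
    """Inverse of haar_dwt2 (perfect reconstruction)."""
    h, w = LL.shape
    a = np.empty((h, 2 * w)); d = np.empty((h, 2 * w))
    a[:, 0::2] = LL + LH; a[:, 1::2] = LL - LH
    d[:, 0::2] = HL + HH; d[:, 1::2] = HL - HH
    img = np.empty((2 * h, 2 * w))
    img[0::2, :] = a + d
    img[1::2, :] = a - d
    return img

rng = np.random.default_rng(0)
img = rng.uniform(0.0, 1.0, size=(8, 8))      # stand-in for a grayscale face crop

LL, LH, HL, HH = haar_dwt2(img)
eps = 0.05                                    # hypothetical perturbation budget
LL_adv = LL + eps * np.sign(rng.standard_normal(LL.shape))  # perturb low freq only
protected = haar_idwt2(LL_adv, LH, HL, HH)    # high-frequency subbands unchanged
```

Because only `LL` is modified, the detail subbands of `protected` match the original exactly, and each pixel moves by at most `eps` — the spirit of keeping the protected image visually plausible while injecting a disruption the generator must propagate. In the paper this hand-crafted perturbation is replaced by a learned encoder–generator–decoder pipeline.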


Key Contributions

  • Low-frequency perceptual perturbation framework that targets the generative process of face-swapping models rather than classification accuracy, using DWT to decouple and selectively disrupt low-frequency components while preserving high-frequency details.
  • Integrated encoder–perturbation generator–decoder architecture that jointly optimizes feature extraction, perturbation generation, and image reconstruction for imperceptible yet disruptive adversarial protection.
  • Empirical demonstration on CelebA-HQ and LFW showing significant reductions in face-swapping effectiveness and improved defense success rates against models including SimSwap, InfoSwap, E4S, and UniFace.

🛡️ Threat Analysis

Output Integrity Attack

Proposes a proactive content protection method that prevents convincing deepfake face-swapping outputs from being generated — directly targeting AI-generated content integrity. The perturbations act as an active defense analogous to anti-deepfake image protections, which ML09 explicitly covers. The goal is content authenticity and prevention of synthetic media manipulation, not classifier evasion.


Details

Domains
vision
Model Types
gan
Threat Tags
white_box, inference_time, digital
Datasets
CelebA-HQ, LFW
Applications
face swapping, deepfake generation, facial privacy protection