ID-Eraser: Proactive Defense Against Face Swapping via Identity Perturbation
Junyan Luo 1, Peipeng Yu 1, Jianwei Fei 2, Shiya Zeng 1, Xiaoyu Zhou 1, Zhihua Xia 1, Xiang Liu 3
Published on arXiv
2604.21465
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
Achieves the lowest Top-1 face recognition accuracy (0.30) with the best visual quality (FID 1.64, LPIPS 0.020), reduces the identity similarity of protected swaps to an average of 0.504 across five face swapping models, and drops Tencent API similarity from 0.76 to 0.36
ID-Eraser
Novel technique introduced
Deepfake technologies have rapidly advanced with modern generative AI, and face swapping in particular poses serious threats to privacy and digital security. Existing proactive defenses mostly rely on pixel-level perturbations, which are ineffective against contemporary swapping models that extract robust high-level identity embeddings. We propose ID-Eraser, a feature-space proactive defense that removes identifiable facial information to prevent malicious face swapping. By injecting learnable perturbations into identity embeddings and reconstructing natural-looking protection images through a Face Revive Generator (FRG), ID-Eraser produces visually realistic results for humans while rendering the protected identities unusable for Deepfake models. Experiments show that ID-Eraser substantially disrupts identity recognition across diverse face recognition and swapping systems under strict black-box settings, achieving the lowest Top-1 accuracy (0.30) with the best FID (1.64) and LPIPS (0.020). Compared with swaps generated from clean inputs, the identity similarity of protected swaps drops sharply to an average of 0.504 across five representative face swapping models. ID-Eraser further demonstrates strong cross-dataset generalization, robustness to common distortions, and practical effectiveness on commercial APIs, reducing Tencent API similarity from 0.76 to 0.36.
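The abstract reports two identity metrics: identity similarity and Top-1 recognition accuracy. The summary does not say how either is computed, so the sketch below rests on a common convention and is an assumption: similarity is the cosine between two face embeddings, and Top-1 accuracy is the fraction of probe embeddings whose nearest gallery embedding carries the correct label. The function names `cosine_similarity` and `top1_accuracy` are hypothetical and are not taken from the paper.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (assumed metric)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def top1_accuracy(probes, probe_labels, gallery, gallery_labels):
    """Fraction of probes whose most similar gallery entry has the right label."""
    probes = np.asarray(probes, dtype=float)
    gallery = np.asarray(gallery, dtype=float)
    # L2-normalise rows so a dot product equals cosine similarity.
    p = probes / np.linalg.norm(probes, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = p @ g.T                       # shape: (num_probes, num_gallery)
    nearest = np.argmax(sims, axis=1)    # index of best gallery match per probe
    predicted = np.asarray(gallery_labels)[nearest]
    return float(np.mean(predicted == np.asarray(probe_labels)))
```

Under this reading, a successful defense pushes both numbers down: protected images should match their true identity less often (lower Top-1), and swaps built from them should sit further from the source identity (lower similarity).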
Key Contributions
- First feature-level proactive defense against face swapping, perturbing identity embeddings rather than pixels
- Feature Perturbation Module (FPM) and Face Revive Generator (FRG) framework that produces visually realistic protected images while disrupting identity recognition
- Strong cross-model generalization and robustness: the identity similarity of protected swaps drops to an average of 0.504 across five face swapping models, and the defense carries over to commercial APIs (Tencent API similarity falls from 0.76 to 0.36)
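The idea of perturbing an identity embedding, rather than pixels, can be illustrated with a toy example. This is not the paper's Feature Perturbation Module or Face Revive Generator, whose architectures and losses are not described in this summary. It is a minimal sketch under stated assumptions: the embedding is a plain vector, the perturbation is found by projected gradient descent instead of a learned network, and the budget `eps`, step size `lr`, and function name `perturb_identity` are all hypothetical.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def perturb_identity(embedding, eps=0.8, lr=0.1, steps=500, seed=0):
    """Toy stand-in for feature-space perturbation (NOT the paper's FPM).

    Finds delta with ||delta|| <= eps that lowers the cosine similarity
    between (e + delta) and the unit-normalised embedding e.
    """
    rng = np.random.default_rng(seed)
    e = embedding / np.linalg.norm(embedding)
    # Start from a small random offset: the gradient is exactly zero at delta = 0.
    delta = rng.standard_normal(e.shape)
    delta *= 0.01 / np.linalg.norm(delta)
    for _ in range(steps):
        x = e + delta
        x_norm = np.linalg.norm(x)
        cos = (x @ e) / x_norm
        # Gradient of cos(x, e) with respect to x, for unit-norm e.
        grad = (e - cos * x / x_norm) / x_norm
        delta = delta - lr * grad          # descend: reduce similarity
        norm = np.linalg.norm(delta)
        if norm > eps:                     # project back onto the L2 ball
            delta *= eps / norm
    return e + delta

# Usage: a random vector stands in for a real face embedding.
original = np.random.default_rng(1).standard_normal(128)
protected = perturb_identity(original)
similarity_after = cosine(protected, original)
```

In the actual framework, the perturbed embedding would then be decoded back into a natural-looking image by the generator. This sketch stops at the embedding and only shows why a bounded change in feature space can lower identity similarity.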
🛡️ Threat Analysis
ID-Eraser is a defense against face swapping, which is treated here as an adversarial manipulation at inference time. It injects perturbations into the identity feature space so that face swapping models fail to extract a valid identity embedding from the protected image. Although the defense operates in feature space rather than pixel space, it still protects against adversarial manipulation of model inputs and outputs: face swapping is classed as an evasion attack in which the attacker manipulates identity features to generate forged content. The paper explicitly positions ID-Eraser as a defense against deepfake attacks that exploit face recognition and face swapping systems.