ID-Eraser: Proactive Defense Against Face Swapping via Identity Perturbation

Deepfake technologies have rapidly advanced with modern generative AI, and face swapping in particular poses serious threats to privacy and digital security. Existing proactive defenses mostly rely on pixel-level perturbations, which are ineffective against contemporary swapping models that extract robust high-level identity embeddings. We propose ID-Eraser, a feature-space proactive defense that removes identifiable facial information to prevent malicious face swapping. By injecting learnable perturbations into identity embeddings and reconstructing natural-looking protection images through a Face Revive Generator (FRG), ID-Eraser produces visually realistic results for humans while rendering the protected identities unusable for Deepfake models. Experiments show that ID-Eraser substantially disrupts identity recognition across diverse face recognition and swapping systems under strict black-box settings, achieving the lowest Top-1 accuracy (0.30) with the best FID (1.64) and LPIPS (0.020). Compared with swaps generated from clean inputs, the identity similarity of protected swaps drops sharply to an average of 0.504 across five representative face swapping models. ID-Eraser further demonstrates strong cross-dataset generalization, robustness to common distortions, and practical effectiveness on commercial APIs, reducing Tencent API similarity from 0.76 to 0.36.
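The abstract reports results as an "identity similarity" between faces. As a rough illustration only, the sketch below assumes this score is the cosine similarity between face-recognition embeddings, which is a common convention but is not confirmed by the text above. The vectors and the names `source`, `clean_swap`, and `protected_swap` are hypothetical toy values, not data from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def mean_similarity(pairs):
    """Average cosine similarity over (source, swap) embedding pairs."""
    scores = [cosine_similarity(src, swp) for src, swp in pairs]
    return sum(scores) / len(scores)


# Hypothetical 4-dimensional embeddings; real recognizers use far more dimensions.
source = [1.0, 0.0, 0.0, 0.0]
clean_swap = [0.9, 0.1, 0.0, 0.0]      # swap made from an unprotected image
protected_swap = [0.5, 0.5, 0.5, 0.5]  # swap made from a protected image

clean_score = cosine_similarity(source, clean_swap)
protected_score = cosine_similarity(source, protected_swap)
```

Under this assumption, a lower score for the protected swap means the swapped face carries less of the source identity, which is the direction of change the reported numbers describe.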

Key Contributions

First feature-level proactive defense against face swapping via identity embedding perturbation rather than pixel-level perturbations
Feature Perturbation Module (FPM) and Face Revive Generator (FRG) framework that produces visually realistic protected images while disrupting identity recognition
Strong cross-model generalization and robustness: the identity similarity of protected swaps drops to an average of 0.504 across five face swapping models, and similarity on the Tencent commercial API falls from 0.76 to 0.36
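The idea of perturbing an identity embedding rather than pixels can be sketched in a few lines. This is a toy stand-in, not the paper's Feature Perturbation Module: the perturbation here is a fixed vector clipped to an L2 budget, whereas the paper describes a learnable one, and the helper names (`project_to_ball`, `perturb_identity`) and the `radius` parameter are assumptions made for illustration.

```python
import math


def l2_norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    n = l2_norm(v)
    if n == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / n for x in v]


def project_to_ball(delta, radius):
    """Shrink delta so its L2 norm does not exceed radius."""
    n = l2_norm(delta)
    if n <= radius:
        return list(delta)
    return [x * radius / n for x in delta]


def perturb_identity(embedding, delta, radius):
    """Add a norm-bounded perturbation and renormalize the result."""
    bounded = project_to_ball(delta, radius)
    return normalize([e + d for e, d in zip(embedding, bounded)])


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return sum(x * y for x, y in zip(a, b)) / (l2_norm(a) * l2_norm(b))


# Hypothetical 3-dimensional identity embedding and perturbation direction.
identity = normalize([3.0, 4.0, 0.0])
delta = [0.0, 0.0, 5.0]  # clipped to the budget below before use
protected = perturb_identity(identity, delta, radius=1.0)
similarity = cosine(identity, protected)
```

In a full system the perturbed embedding would be decoded back into an image (the role the text assigns to the Face Revive Generator), which this sketch does not attempt.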

🛡️ Threat Analysis

Input Manipulation Attack

ID-Eraser is a defense against face swapping attacks, which are adversarial manipulations at inference time. The defense injects perturbations into the identity feature space so that face swapping models fail to extract valid identity embeddings from protected images. Although the defense operates in feature space rather than pixel space, it fundamentally protects against adversarial manipulation of model inputs and outputs: face swapping is an evasion attack in which the attacker manipulates identity features to generate forged content. The paper explicitly positions ID-Eraser as a defense against deepfake attacks that exploit face recognition and swapping systems.
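One way to picture "failing to extract a valid identity" is a Top-1 identification test: a probe embedding is matched against a gallery of known identities, and the defense succeeds when the best match is no longer the true person. The sketch below assumes cosine-similarity matching and uses a made-up two-person gallery; the names and vectors are hypothetical and do not reproduce the paper's evaluation protocol.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top1_identity(probe, gallery):
    """Return the gallery name whose embedding is most similar to the probe."""
    return max(gallery, key=lambda name: cosine(probe, gallery[name]))


def top1_accuracy(probes, gallery):
    """Fraction of (true_name, embedding) probes matched to their true name."""
    hits = sum(1 for name, emb in probes if top1_identity(emb, gallery) == name)
    return hits / len(probes)


# Hypothetical gallery and probes in a 2-dimensional embedding space.
gallery = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
clean_probes = [("alice", [0.9, 0.1]), ("bob", [0.2, 0.8])]
protected_probes = [("alice", [0.1, 0.9]), ("bob", [0.3, 0.7])]

clean_acc = top1_accuracy(clean_probes, gallery)
protected_acc = top1_accuracy(protected_probes, gallery)
```

A lower Top-1 accuracy on protected probes than on clean ones is the outcome a proactive defense aims for in this kind of test.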

Details

Domains

vision, generative

Model Types

cnn, gan

Threat Tags

black_box, inference_time, digital

Datasets

CelebA-HQ, FFHQ

Applications

Similar Papers

Make Identity Unextractable yet Perceptible: Synthesis-Based Privacy Protection for Subject Faces in Photos

SIDeR: Semantic Identity Decoupling for Unrestricted Face Privacy

Machine Pareidolia: Protecting Facial Image with Emotional Editing

Architecture-Agnostic Feature Synergy for Universal Defense Against Heterogeneous Generative Threats

Diffusion-Guided Adversarial Perturbation Injection for Generalizable Defense Against Facial Manipulations

AEGIS: Preserving privacy of 3D Facial Avatars with Adversarial Perturbations

Attack Assessment and Augmented Identity Recognition for Human Skeleton Data

RobPI: Robust Private Inference against Malicious Client