Machine Pareidolia: Protecting Facial Image with Emotional Editing
Published on arXiv
2603.03665
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
MAP outperforms noise-based, makeup-based, and freeform attribute-editing baselines in both qualitative fidelity and quantitative metrics, and successfully evades a commercial facial recognition (FR) API in the black-box setting.
MAP (Machine Pareidolia)
Novel technique introduced
The proliferation of facial recognition (FR) systems has raised privacy concerns in the digital realm, as malicious uses of FR models pose a significant threat. Traditional countermeasures, such as makeup style transfer, have suffered from low transferability in black-box settings and limited applicability across various demographic groups, including males and individuals with darker skin tones. To address these challenges, we introduce a novel facial privacy protection method, dubbed **MAP**, a pioneering approach that employs human emotion modifications to disguise original identities as target identities in facial images. Our method uniquely fine-tunes a score network to learn dual objectives (target identity and human expression), which are jointly optimized through gradient projection to ensure convergence at a shared local optimum. Additionally, we enhance the perceptual quality of protected images by applying local smoothness regularization and optimizing the score matching loss within our network. Empirical experiments demonstrate that our innovative approach surpasses previous baselines, including noise-based, makeup-based, and freeform attribute methods, in both qualitative fidelity and quantitative metrics. Furthermore, MAP proves its effectiveness against an online FR API and shows advanced adaptability in uncommon photographic scenarios.
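The abstract describes jointly optimizing the identity and expression objectives "through gradient projection." The paper's exact projection rule is not given in this summary; the sketch below shows one common interpretation (a PCGrad-style projection that removes the conflicting component of one gradient with respect to the other), which is an assumption, not the authors' published implementation.

```python
import numpy as np

def project_conflicting(g_a, g_b, eps=1e-12):
    """If gradients g_a and g_b conflict (negative dot product), project
    g_a onto the normal plane of g_b so the joint update does not fight
    itself. PCGrad-style; an assumption about the paper's 'gradient
    projection', not its actual code."""
    dot = float(np.dot(g_a, g_b))
    if dot < 0:
        g_a = g_a - (dot / (float(np.dot(g_b, g_b)) + eps)) * g_b
    return g_a

def joint_step(params, g_identity, g_expression, lr=0.01):
    """One descent step on the combined objective: each task gradient is
    projected against the other's original direction, then summed."""
    g_id_p = project_conflicting(g_identity, g_expression)
    g_ex_p = project_conflicting(g_expression, g_identity)
    return params - lr * (g_id_p + g_ex_p)
```

After projection, neither per-task gradient points against the other task's original direction, which is one way to steer both losses toward a shared local optimum.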
Key Contributions
- Novel emotional-editing-based adversarial facial protection method (MAP) using a fine-tuned score network that jointly optimizes target identity and expression objectives via gradient projection
- Local smoothness regularization and score matching loss optimization to improve perceptual quality of protected images
- Demonstrated effectiveness against online FR APIs and across underrepresented demographic groups (males, darker skin tones)
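The contributions mention "local smoothness regularization" to improve perceptual quality of the protected images. The precise regularizer is not specified in this summary; a total-variation-style penalty on neighboring pixels is a common choice for local smoothness and is shown below purely as an illustrative assumption.

```python
import numpy as np

def local_smoothness_penalty(img):
    """Total-variation-style penalty on an H x W x C image array: sum of
    squared differences between vertically and horizontally adjacent
    pixels. One plausible form of 'local smoothness regularization';
    the paper's exact term may differ."""
    dh = img[1:, :, :] - img[:-1, :, :]   # vertical neighbor differences
    dw = img[:, 1:, :] - img[:, :-1, :]   # horizontal neighbor differences
    return float((dh ** 2).sum() + (dw ** 2).sum())
```

Minimizing this term alongside the adversarial objectives discourages high-frequency perturbation artifacts, which is consistent with the stated goal of better perceptual quality.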
🛡️ Threat Analysis
MAP crafts semantically meaningful adversarial inputs (emotionally modified faces) that cause facial recognition models to misidentify the subject as a target identity at inference time — this is an adversarial evasion attack deployed as a privacy defense. Tested in black-box settings against commercial FR APIs.
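Black-box evasion of this kind is typically scored by comparing FR embeddings: the protected image should no longer match the original identity while matching the attacker-chosen target. The helper below is a generic sketch of that success criterion; the embedding vectors and the 0.36 threshold are illustrative assumptions, since commercial FR APIs return their own proprietary match scores.

```python
import numpy as np

def cosine_sim(a, b, eps=1e-12):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def evasion_success(emb_protected, emb_original, emb_target, thresh=0.36):
    """A protected image 'evades' if its embedding matches the target
    identity (impersonation) but no longer matches the original identity
    (anonymization). Threshold is illustrative, not from the paper."""
    impersonates = cosine_sim(emb_protected, emb_target) >= thresh
    anonymized = cosine_sim(emb_protected, emb_original) < thresh
    return impersonates and anonymized
```

In a real evaluation the embeddings would come from the target FR model or API responses rather than local vectors.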