Machine Pareidolia: Protecting Facial Image with Emotional Editing
Published on arXiv
2603.03665
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
MAP outperforms noise-based, makeup-based, and freeform attribute-editing baselines in both qualitative fidelity and quantitative metrics, and successfully evades a commercial facial recognition (FR) API in the black-box setting.
MAP (Machine Pareidolia)
Novel technique introduced
The proliferation of facial recognition (FR) systems has raised privacy concerns in the digital realm, as malicious uses of FR models pose a significant threat. Traditional countermeasures, such as makeup style transfer, have suffered from low transferability in black-box settings and limited applicability across various demographic groups, including males and individuals with darker skin tones. To address these challenges, we introduce a novel facial privacy protection method, dubbed **MAP**, a pioneering approach that employs human emotion modifications to disguise original identities as target identities in facial images. Our method uniquely fine-tunes a score network to learn dual objectives (target identity and human expression), which are jointly optimized through gradient projection to ensure convergence at a shared local optimum. Additionally, we enhance the perceptual quality of protected images by applying local smoothness regularization and optimizing the score matching loss within our network. Empirical experiments demonstrate that our innovative approach surpasses previous baselines, including noise-based, makeup-based, and freeform attribute methods, in both qualitative fidelity and quantitative metrics. Furthermore, MAP proves its effectiveness against an online FR API and shows advanced adaptability in uncommon photographic scenarios.
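The abstract describes jointly optimizing the identity and expression objectives "through gradient projection." The paper's exact projection rule is not given in this summary; the sketch below shows one common interpretation (a PCGrad-style projection that removes the conflicting component of one gradient with respect to the other), which is an assumption, not the authors' published implementation.

```python
import numpy as np

def project_conflicting(g_a, g_b, eps=1e-12):
    """If gradients g_a and g_b conflict (negative dot product), project
    g_a onto the normal plane of g_b so the joint update does not fight
    itself. PCGrad-style; an assumption about the paper's 'gradient
    projection', not its actual code."""
    dot = float(np.dot(g_a, g_b))
    if dot < 0:
        g_a = g_a - (dot / (float(np.dot(g_b, g_b)) + eps)) * g_b
    return g_a

def joint_step(params, g_identity, g_expression, lr=0.01):
    """One descent step on the combined objective: each task gradient is
    projected against the other's original direction, then summed."""
    g_id_p = project_conflicting(g_identity, g_expression)
    g_ex_p = project_conflicting(g_expression, g_identity)
    return params - lr * (g_id_p + g_ex_p)
```

After projection, neither per-task gradient points against the other task's original direction, which is one way to steer both losses toward a shared local optimum.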
Key Contributions
- Novel emotional-editing-based adversarial facial protection method (MAP) using a fine-tuned score network that jointly optimizes target identity and expression objectives via gradient projection
- Local smoothness regularization and score matching loss optimization to improve perceptual quality of protected images
- Demonstrated effectiveness against online FR APIs and across underrepresented demographic groups (males, darker skin tones)
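The contributions mention "local smoothness regularization" to improve perceptual quality of the protected images. The precise regularizer is not specified in this summary; a total-variation-style penalty on neighboring pixels is a common choice for local smoothness and is shown below purely as an illustrative assumption.

```python
import numpy as np

def local_smoothness_penalty(img):
    """Total-variation-style penalty on an H x W x C image array: sum of
    squared differences between vertically and horizontally adjacent
    pixels. One plausible form of 'local smoothness regularization';
    the paper's exact term may differ."""
    dh = img[1:, :, :] - img[:-1, :, :]   # vertical neighbor differences
    dw = img[:, 1:, :] - img[:, :-1, :]   # horizontal neighbor differences
    return float((dh ** 2).sum() + (dw ** 2).sum())
```

Minimizing this term alongside the adversarial objectives discourages high-frequency perturbation artifacts, which is consistent with the stated goal of better perceptual quality.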
🛡️ Threat Analysis
MAP crafts semantically meaningful adversarial inputs (emotionally modified faces) that cause facial recognition models to misidentify the subject as a target identity at inference time — this is an adversarial evasion attack deployed as a privacy defense. Tested in black-box settings against commercial FR APIs.
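Black-box evasion of this kind is typically scored by comparing FR embeddings: the protected image should no longer match the original identity while matching the attacker-chosen target. The helper below is a generic sketch of that success criterion; the embedding vectors and the 0.36 threshold are illustrative assumptions, since commercial FR APIs return their own proprietary match scores.

```python
import numpy as np

def cosine_sim(a, b, eps=1e-12):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def evasion_success(emb_protected, emb_original, emb_target, thresh=0.36):
    """A protected image 'evades' if its embedding matches the target
    identity (impersonation) but no longer matches the original identity
    (anonymization). Threshold is illustrative, not from the paper."""
    impersonates = cosine_sim(emb_protected, emb_target) >= thresh
    anonymized = cosine_sim(emb_protected, emb_original) < thresh
    return impersonates and anonymized
```

In a real evaluation the embeddings would come from the target FR model or API responses rather than local vectors.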