defense 2026

Machine Pareidolia: Protecting Facial Image with Emotional Editing

Binh M. Le, Simon S. Woo


Published on arXiv (2603.03665)

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

MAP outperforms noise-based, makeup-based, and freeform-attribute baselines in both qualitative fidelity and quantitative metrics, and successfully evades a commercial FR API in black-box settings.

MAP (Machine Pareidolia)

Novel technique introduced


The proliferation of facial recognition (FR) systems has raised privacy concerns in the digital realm, as malicious uses of FR models pose a significant threat. Traditional countermeasures, such as makeup style transfer, have suffered from low transferability in black-box settings and limited applicability across various demographic groups, including males and individuals with darker skin tones. To address these challenges, we introduce a novel facial privacy protection method, dubbed MAP, a pioneering approach that employs human emotion modifications to disguise original identities as target identities in facial images. Our method uniquely fine-tunes a score network to learn dual objectives, target identity and human expression, which are jointly optimized through gradient projection to ensure convergence at a shared local optimum. Additionally, we enhance the perceptual quality of protected images by applying local smoothness regularization and optimizing the score matching loss within our network. Empirical experiments demonstrate that our innovative approach surpasses previous baselines, including noise-based, makeup-based, and freeform attribute methods, in both qualitative fidelity and quantitative metrics. Furthermore, MAP proves its effectiveness against an online FR API and shows advanced adaptability in uncommon photographic scenarios.


Key Contributions

  • Novel emotional-editing-based adversarial facial protection method (MAP) using a fine-tuned score network that jointly optimizes target identity and expression objectives via gradient projection
  • Local smoothness regularization and score matching loss optimization to improve perceptual quality of protected images
  • Demonstrated effectiveness against online FR APIs and across underrepresented demographic groups (males, darker skin tones)
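The joint optimization of the identity and expression objectives via gradient projection can be pictured with a PCGrad-style rule: when the two gradients conflict (negative dot product), the expression gradient is projected onto the normal plane of the identity gradient before the step is taken. The sketch below is an illustration of that general idea, not the paper's exact update; the function name and the projection rule are assumptions.

```python
import numpy as np

def project_conflicting(g_id: np.ndarray, g_expr: np.ndarray) -> np.ndarray:
    """Combine two objective gradients, removing the component of the
    expression gradient that opposes the identity gradient.

    PCGrad-style sketch; MAP's exact projection scheme is an assumption here.
    """
    dot = np.dot(g_id, g_expr)
    if dot < 0:
        # Project g_expr onto the plane orthogonal to g_id,
        # discarding the conflicting component.
        g_expr = g_expr - (dot / np.dot(g_id, g_id)) * g_id
    return g_id + g_expr

# Toy example: g_expr = (-1, 1) conflicts with g_id = (1, 0).
g = project_conflicting(np.array([1.0, 0.0]), np.array([-1.0, 1.0]))
# After projection the combined step no longer opposes the identity objective.
```

After projection, the combined update makes non-negative progress on both objectives, which is the intuition behind "convergence at a shared local optimum" in the abstract.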

🛡️ Threat Analysis

Input Manipulation Attack

MAP crafts semantically meaningful adversarial inputs (emotionally modified faces) that cause facial recognition models to misidentify the subject as a target identity at inference time. In other words, it is an adversarial evasion attack deployed as a privacy defense, tested in black-box settings against a commercial FR API.


Details

Domains
vision, generative
Model Types
diffusion, cnn
Threat Tags
black_box, inference_time, targeted, digital
Applications
facial recognition, facial privacy protection