Published on arXiv

2512.16126

Membership Inference Attack

OWASP ML Top 10 — ML04

Key Finding

DVIA extracts significantly more membership information on retained data by exploiting the asymmetric prediction stability between member and non-member samples across original and unlearned models, outperforming single-model inference baselines.

DVIA (Dual-View Inference Attack)

Novel technique introduced


Machine unlearning is a recently popularized technique for removing specific training data from a trained model, enabling compliance with data deletion requests. While it protects the rights of users who request unlearning, it also introduces new privacy risks. Prior work has primarily focused on the privacy of the unlearned data itself, while the risks to retained data remain largely unexplored. To address this gap, we focus on the privacy risks of retained data and, for the first time, reveal the vulnerabilities introduced by machine unlearning under the dual-view setting, in which an adversary can query both the original and the unlearned models. From an information-theoretic perspective, we introduce the concept of privacy knowledge gain and demonstrate that the dual-view setting allows adversaries to obtain more information than querying either model alone, thereby amplifying privacy leakage. To demonstrate this threat concretely, we propose DVIA, a Dual-View Inference Attack, which extracts membership information about retained data using black-box queries to both models. DVIA eliminates the need to train an attack model, instead employing a lightweight likelihood ratio inference module for efficient inference. Experiments across different datasets and model architectures validate the effectiveness of DVIA and highlight the privacy risks inherent in the dual-view setting.


Key Contributions

  • Identifies a previously unexplored attack surface in machine unlearning: users who initiate unlearning requests gain dual-view access (original + unlearned model) that amplifies membership inference on retained data
  • Introduces 'Privacy Knowledge Gain' as an information-theoretic metric formalizing the additional membership information obtainable from the dual-view setting
  • Proposes DVIA, the first MIA targeting retained data in dual-view settings, using Unlearning Confidence Difference (UCD) and a training-free likelihood ratio inference module
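The core signal can be sketched as follows. This is an illustrative reconstruction, not the paper's exact formulation: the UCD definition (original-model confidence minus unlearned-model confidence) and the Gaussian form of the likelihood ratio are assumptions, and the distribution parameters below are hypothetical.

```python
import numpy as np

def ucd(conf_original, conf_unlearned):
    """Unlearning Confidence Difference (UCD): the shift in a target
    sample's confidence between the original and unlearned models.
    (Illustrative definition -- the paper's statistic may differ.)"""
    return conf_original - conf_unlearned

def likelihood_ratio_score(u, member_mu, member_sigma,
                           nonmember_mu, nonmember_sigma):
    """Training-free likelihood ratio: compare a UCD value u under
    Gaussian models of the member vs. non-member UCD distributions.
    Positive score -> more consistent with a retained member."""
    def log_gauss(x, mu, sigma):
        return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)
    return log_gauss(u, member_mu, member_sigma) - log_gauss(u, nonmember_mu, nonmember_sigma)

# Retained members tend to keep stably high confidence across both models
# (small UCD), while non-members shift more. Hypothetical parameters:
u_member = ucd(0.97, 0.95)     # small, stable difference
u_nonmember = ucd(0.80, 0.55)  # larger, unstable difference
score_m = likelihood_ratio_score(u_member, 0.02, 0.05, 0.25, 0.15)
score_n = likelihood_ratio_score(u_nonmember, 0.02, 0.05, 0.25, 0.15)
assert score_m > 0 > score_n   # positive -> inferred retained member
```

Because the likelihood ratio uses closed-form Gaussian densities rather than a learned classifier, no attack model needs to be trained, which matches the paper's "lightweight, training-free" framing.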

🛡️ Threat Analysis

Membership Inference Attack

DVIA is a membership inference attack that determines whether specific data points belong to the retained (training) set, using black-box queries to both the original and unlearned models. This is a binary membership determination — the canonical ML04 threat — applied to a novel dual-view setting introduced by machine unlearning.
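The dual-view query flow can be sketched as below. Everything here is a hedged toy: the two `query_*` endpoints, the top-class-confidence stability statistic, and the `threshold` value are stand-ins for whatever black-box APIs and decision rule a real deployment would expose.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def dual_view_membership(query_original, query_unlearned, x, threshold=0.1):
    """Black-box dual-view membership test (illustrative): a sample whose
    top-class confidence stays stable across the original and unlearned
    models is inferred to belong to the retained training set.
    query_* are assumed black-box APIs returning logits for x."""
    p_orig = softmax(query_original(x))
    p_unl = softmax(query_unlearned(x))
    instability = abs(p_orig.max() - p_unl.max())
    return bool(instability < threshold)  # True -> inferred retained member

# Toy stand-ins for the two model endpoints:
retained_x, nonmember_x = "sample_a", "sample_b"
orig = lambda x: np.array([4.0, 0.5, 0.2])  # confident on everything
unl = lambda x: (np.array([3.9, 0.6, 0.2]) if x == retained_x
                 else np.array([1.2, 1.0, 0.9]))  # shifts on non-members
assert dual_view_membership(orig, unl, retained_x)
assert not dual_view_membership(orig, unl, nonmember_x)
```

The attacker needs only prediction access to both models, which is exactly the position of a user who filed the unlearning request and can compare the service before and after deletion.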


Details

Domains
vision
Model Types
cnn, transformer
Threat Tags
black_box, inference_time
Applications
image classification, machine unlearning APIs