
Revisiting the LiRA Membership Inference Attack Under Realistic Assumptions

Najeeb Jebreel, Mona Khalil, David Sánchez, Josep Domingo-Ferrer

0 citations


Published on arXiv (2603.07567)

Membership Inference Attack

OWASP ML Top 10 — ML04

Key Finding

Under realistic anti-overfitting training and shadow-based thresholds with skewed priors (π ≤ 10%), LiRA's positive predictive value drops substantially from near-perfect levels, suggesting prior evaluations significantly overstated its effectiveness

LiRA (Likelihood-Ratio Attack)

Novel technique introduced


Membership inference attacks (MIAs) have become the standard tool for evaluating privacy leakage in machine learning (ML). Among them, the Likelihood-Ratio Attack (LiRA) is widely regarded as the state of the art when sufficient shadow models are available. However, prior evaluations have often overstated the effectiveness of LiRA by attacking models overconfident on their training samples, calibrating thresholds on target data, assuming balanced membership priors, and/or overlooking attack reproducibility. We re-evaluate LiRA under a realistic protocol that (i) trains models using anti-overfitting (AOF) and transfer learning (TL), when applicable, to reduce overconfidence as in production models; (ii) calibrates decision thresholds using shadow models and data rather than target data; (iii) measures positive predictive value (PPV, or precision) under shadow-based thresholds and skewed membership priors (π ≤ 10%); and (iv) quantifies per-sample membership reproducibility across different seeds and training variations. We find that AOF significantly weakens LiRA, and that TL further reduces attack effectiveness while improving model accuracy. Under shadow-based thresholds and skewed priors, LiRA's PPV often drops substantially, especially under AOF or AOF+TL. We also find that thresholded vulnerable sets at extremely low FPR show poor reproducibility across runs, while likelihood-ratio rankings are more stable. These results suggest that LiRA, and likely weaker MIAs, are less effective than previously suggested under realistic conditions, and that reliable privacy auditing requires evaluation protocols that reflect practical training practices, feasible attacker assumptions, and reproducibility considerations. Code is available at https://github.com/najeebjebreel/lira_analysis.
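The PPV collapse under skewed priors follows directly from Bayes' rule: precision depends on the membership prior π, not just the attack's TPR and FPR. A minimal sketch (the operating point below is illustrative, not a number from the paper):

```python
# PPV (precision) of a membership inference attack as a function of the
# membership prior pi, given a fixed TPR/FPR operating point.
# The TPR/FPR values are hypothetical, chosen only to illustrate the effect.

def ppv(tpr: float, fpr: float, pi: float) -> float:
    """Bayes-rule precision: P(member | attack flags 'member')."""
    return (tpr * pi) / (tpr * pi + fpr * (1.0 - pi))

# A shadow-calibrated threshold whose realized operating point is
# 30% TPR at 5% FPR (illustrative).
tpr, fpr = 0.30, 0.05

for pi in (0.50, 0.10, 0.01):
    print(f"pi={pi:.2f}  PPV={ppv(tpr, fpr, pi):.3f}")
# PPV falls from ~0.86 at pi=0.50 to ~0.40 at pi=0.10 and ~0.06 at pi=0.01
```

The same TPR/FPR that looks strong under a balanced prior (π = 0.5) yields mostly false alarms once members are rare, which is the regime the paper's skewed-prior evaluation targets.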


Key Contributions

  • Proposes a realistic LiRA evaluation protocol using anti-overfitting and transfer learning to reduce model overconfidence, shadow-based threshold calibration, and skewed membership priors (π ≤ 10%)
  • Demonstrates that LiRA's PPV drops substantially under realistic conditions, particularly when anti-overfitting or transfer learning is applied, challenging prior overly optimistic evaluations
  • Quantifies per-sample reproducibility of LiRA's 'vulnerable' sets, finding low FPR thresholded sets are poorly reproducible across runs while likelihood-ratio rankings are more stable

🛡️ Threat Analysis

Membership Inference Attack

The paper's primary contribution is a rigorous re-evaluation of the LiRA membership inference attack, analyzing its effectiveness under realistic training practices (anti-overfitting, transfer learning), shadow-based thresholds, skewed membership priors, and reproducibility — directly targeting the MIA threat and its use as a privacy auditing benchmark.
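For context on the attack being re-evaluated: LiRA's per-sample test is a Gaussian likelihood ratio over logit-scaled confidences collected from "in" and "out" shadow models. A minimal numpy sketch, with synthetic beta-distributed confidences standing in for real shadow-model outputs (an illustrative assumption, not the paper's experimental setup):

```python
import numpy as np

def logit_scale(p, eps=1e-6):
    """Stabilized logit of the model's confidence in the true label."""
    p = np.clip(p, eps, 1 - eps)
    return np.log(p / (1 - p))

def norm_logpdf(x, mu, sd):
    """Log-density of N(mu, sd^2) at x."""
    return -0.5 * ((x - mu) / sd) ** 2 - np.log(sd * np.sqrt(2 * np.pi))

def lira_score(conf_target, confs_in, confs_out):
    """Log likelihood ratio; larger values mean 'more member-like'."""
    x = logit_scale(conf_target)
    z_in, z_out = logit_scale(confs_in), logit_scale(confs_out)
    return (norm_logpdf(x, z_in.mean(), z_in.std() + 1e-6)
            - norm_logpdf(x, z_out.mean(), z_out.std() + 1e-6))

rng = np.random.default_rng(0)
# Hypothetical shadow confidences on one sample: "in" shadows were
# trained on it (high confidence), "out" shadows were not.
confs_in = rng.beta(8, 2, size=64)
confs_out = rng.beta(2, 2, size=64)
print(lira_score(0.95, confs_in, confs_out))  # positive: flagged as member
```

The paper's protocol changes where the decision threshold on this score comes from (shadow data rather than target data) and how the resulting decisions are scored (PPV under skewed priors), not the score itself.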


Details

Domains
vision
Model Types
CNN, Transformer
Threat Tags
black_box, inference_time
Datasets
CIFAR-10, CIFAR-100
Applications
image classification, privacy auditing