LeakBoost: Perceptual-Loss-Based Membership Inference Attack
Amit Kravchik Taub, Fred M. Grabovski, Guy Amit, Yisroel Mirsky
Published on arXiv
2602.05748
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
LeakBoost raises AUC from near-chance (0.53–0.62) to 0.81–0.88 and increases TPR at 1% FPR by more than an order of magnitude compared with strong baseline membership inference attacks.
LeakBoost
Novel technique introduced
Membership inference attacks (MIAs) aim to determine whether a sample was part of a model's training set, posing serious privacy risks for modern machine-learning systems. Existing MIAs primarily rely on static indicators, such as loss or confidence, and do not fully leverage the dynamic behavior of models when actively probed. We propose LeakBoost, a perceptual-loss-based interrogation framework that actively probes a model's internal representations to expose hidden membership signals. Given a candidate input, LeakBoost synthesizes an interrogation image by optimizing a perceptual (activation-space) objective, amplifying representational differences between members and non-members. This image is then analyzed by an off-the-shelf membership detector, without modifying the detector itself. When combined with existing membership inference methods, LeakBoost achieves substantial improvements at low false-positive rates across multiple image classification datasets and diverse neural network architectures. In particular, it raises AUC from near-chance levels (0.53–0.62) to 0.81–0.88, and increases TPR at 1% FPR by more than an order of magnitude compared with strong baseline attacks. A detailed sensitivity analysis reveals that deeper layers and short, low-learning-rate optimization produce the strongest leakage, and that improvements concentrate in gradient-based detectors. LeakBoost thus offers a modular and computationally efficient way to assess privacy risks in white-box settings, advancing the study of dynamic membership inference.
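The interrogation step described above can be sketched as a short gradient-based optimization in activation space. The toy linear-tanh "layer", the specific ascent objective, and all hyperparameter values below are assumptions for illustration; the paper's actual perceptual objective and model are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for an intermediate (deeper) layer of the target model.
# The real attack would use the white-box model's own activations.
W = rng.standard_normal((16, 8))

def activations(x):
    return np.tanh(x @ W)  # activation-space representation of input x

def interrogate(x, steps=10, lr=1e-2):
    """Synthesize an interrogation image from candidate x via a few
    gradient steps on an activation-space objective. As a hypothetical
    stand-in for the paper's perceptual loss, we ascend 0.5*||a||^2,
    which amplifies the candidate's internal representation."""
    z = x.copy()
    for _ in range(steps):  # short, low-learning-rate optimization
        a = activations(z)
        # d/dz of 0.5*||a||^2 with a = tanh(z W):
        # grad_i = sum_j a_j * (1 - a_j^2) * W_ij  (chain rule)
        grad = ((1 - a**2) * a) @ W.T
        z = z + lr * grad  # gradient ascent
    return z

x = rng.standard_normal(16)
z = interrogate(x)
# The interrogation image carries an amplified activation signal.
print(np.linalg.norm(activations(z)) > np.linalg.norm(activations(x)))
```

In the full attack, `z` (not the raw candidate `x`) would then be scored by an unmodified, off-the-shelf membership detector.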
Key Contributions
- LeakBoost framework that synthesizes interrogation images by optimizing a perceptual (activation-space) objective to amplify representational differences between training members and non-members
- Modular design that wraps existing off-the-shelf membership detectors without modifying them, boosting their performance substantially at low FPR
- Sensitivity analysis identifying that deeper layers, short optimization runs, and low learning rates maximize leakage, with gains concentrated in gradient-based detectors
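The low-FPR gains claimed above are measured with TPR at a fixed false-positive rate, a standard metric for membership inference. A minimal sketch of that metric, using synthetic score distributions (the Gaussian scores and shift sizes below are illustrative, not the paper's data):

```python
import numpy as np

def tpr_at_fpr(member_scores, nonmember_scores, fpr=0.01):
    """TPR at a fixed low FPR: pick the threshold that lets at most
    `fpr` of non-members through, then measure member recall."""
    thresh = np.quantile(nonmember_scores, 1.0 - fpr)
    return float(np.mean(member_scores > thresh))

rng = np.random.default_rng(1)
nonmembers = rng.normal(0.0, 1.0, 5000)
# Weakly separated scores (baseline) vs. strongly separated scores
# (after an amplification step such as LeakBoost's interrogation).
weak = tpr_at_fpr(rng.normal(0.2, 1.0, 5000), nonmembers)
boosted = tpr_at_fpr(rng.normal(1.5, 1.0, 5000), nonmembers)
print(weak < boosted)  # larger separation lifts TPR at 1% FPR
```

This illustrates why amplifying member/non-member separation matters most in the low-FPR regime: the threshold sits deep in the non-member tail, so only well-separated member scores clear it.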
🛡️ Threat Analysis
The core contribution is a new membership inference attack method (LeakBoost) that actively probes model internals via perceptual-loss optimization to determine whether a sample was in the training set.