
Neighborhood Blending: A Lightweight Inference-Time Defense Against Membership Inference Attacks

Osama Zafar 1, Shaojie Zhan 2, Tianxi Ji 2, Erman Ayday 1

0 citations · 38 references · arXiv (Cornell University)


Published on arXiv

2602.12943

Membership Inference Attack

OWASP ML Top 10 — ML04

Key Finding

Neighborhood Blending significantly reduces MIA success rates to near-random-guess levels while achieving better utility retention than MemGuard and DP-SGD across diverse datasets and models

Neighborhood Blending

Novel technique introduced


In recent years, the widespread adoption of Machine Learning as a Service (MLaaS), particularly in sensitive environments, has raised considerable privacy concerns. Of particular importance are membership inference attacks (MIAs), which exploit behavioral discrepancies between training and non-training data to determine whether a specific record was included in the model's training set, thereby presenting significant privacy risks. Although existing defenses, such as adversarial regularization, DP-SGD, and MemGuard, help mitigate these threats, they often entail trade-offs such as compromised utility, increased computational requirements, or inconsistent protection against diverse attack vectors. In this paper, we introduce a novel inference-time defense mechanism called Neighborhood Blending, which mitigates MIAs without retraining the model or incurring significant computational overhead. Our approach operates post-training by smoothing the model's confidence outputs based on the neighborhood of a queried sample. By averaging predictions from similar training samples selected using differentially private sampling, our method establishes a consistent confidence pattern, rendering members and non-members indistinguishable to an adversary while maintaining high utility. Notably, Neighborhood Blending maintains label integrity (zero label loss) and ensures high utility through an adaptive, "pay-as-you-go" distortion strategy. This model-agnostic approach offers a practical, lightweight solution that enhances privacy without sacrificing model utility. Through extensive experiments across diverse datasets and models, we demonstrate that our defense significantly reduces MIA success rates while preserving model performance, outperforming existing post-hoc defenses like MemGuard and training-time techniques like DP-SGD in terms of utility retention.
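The core mechanism described in the abstract — blending the query's confidence vector with the mean confidence of its neighbors, while backing off the blend weight whenever it would flip the predicted label — can be sketched as follows. This is an illustrative reconstruction under stated assumptions: the function name, the candidate blend weights, and the back-off schedule are hypothetical, not the paper's exact algorithm.

```python
# Hedged sketch of "Neighborhood Blending" at inference time.
# Assumptions: neighbor confidences are precomputed; the blend-weight
# schedule (alphas) and back-off strategy are illustrative choices.
import numpy as np

def neighborhood_blend(query_probs, neighbor_probs,
                       alphas=(0.9, 0.7, 0.5, 0.3, 0.1)):
    """Blend the model's confidence vector for a query with the mean
    confidence over its (DP-sampled) training neighbors, reducing the
    blend weight until the predicted label is unchanged (zero label loss)."""
    query_probs = np.asarray(query_probs, dtype=float)
    neighbor_mean = np.asarray(neighbor_probs, dtype=float).mean(axis=0)
    label = int(np.argmax(query_probs))
    # "Pay-as-you-go": try the heaviest smoothing first, distorting only
    # as much as the label constraint allows.
    for alpha in alphas:
        blended = (1.0 - alpha) * query_probs + alpha * neighbor_mean
        if int(np.argmax(blended)) == label:
            return blended / blended.sum()  # renormalize to a distribution
    return query_probs  # no admissible blend; return the raw confidences
```

Because the returned vector is dominated by the neighborhood average, member and non-member queries with similar neighborhoods produce similar confidence patterns, which is the property that degrades confidence-based MIAs.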


Key Contributions

  • Neighborhood Blending: a post-training, model-agnostic inference-time defense that smooths confidence outputs by averaging predictions from differentially private nearest-neighbor samples
  • Adaptive 'pay-as-you-go' distortion strategy that ensures zero label loss while maintaining high model utility
  • Empirical demonstration that Neighborhood Blending outperforms MemGuard and DP-SGD in utility retention while significantly reducing MIA success rates across diverse datasets and models
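The first contribution relies on selecting neighbors with differential privacy. One standard way to do this — shown here purely as an assumption, since the paper's exact mechanism is not reproduced in this summary — is the exponential mechanism over query-to-training-sample distances:

```python
# Illustrative differentially private neighbor selection via the
# exponential mechanism (utility = -distance). The parameter names and
# the use of this particular mechanism are assumptions for the sketch.
import numpy as np

def dp_sample_neighbors(distances, k=5, epsilon=1.0, sensitivity=1.0, rng=None):
    """Sample k training-sample indices, favoring small distances to the
    query; epsilon trades privacy against closeness to the true k-NN set."""
    rng = np.random.default_rng() if rng is None else rng
    distances = np.asarray(distances, dtype=float)
    logits = -epsilon * distances / (2.0 * sensitivity)
    logits -= logits.max()  # numerical stability before exponentiation
    probs = np.exp(logits) / np.exp(logits).sum()
    return rng.choice(len(distances), size=k, replace=False, p=probs)
```

With large epsilon the sample concentrates on the true nearest neighbors; with small epsilon it approaches uniform sampling, so the released neighborhood leaks less about which training points sit closest to the query.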

🛡️ Threat Analysis

Membership Inference Attack

The paper's primary contribution is a defense specifically designed to prevent adversaries from determining whether a specific record was in the training set — the core definition of a membership inference attack. Neighborhood Blending directly reduces MIA success rates by making member and non-member confidence patterns indistinguishable.


Details

Domains
tabular
Model Types
traditional_ml, cnn
Threat Tags
black_box, inference_time
Applications
machine learning as a service, privacy-sensitive ML classifiers