Score-based Membership Inference on Diffusion Models
Mingxing Rao, Bowen Qu, Daniel Moyer
Published on arXiv
arXiv:2509.25003
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
SimA achieves consistently strong membership inference performance on DDPM and LDM with a single query, and reveals that LDMs are surprisingly more robust to MIA than pixel-space models due to the VAE's information bottleneck.
SimA
Novel technique introduced
Membership inference attacks (MIAs) against diffusion models have emerged as a pressing privacy concern, as these models may inadvertently reveal whether a given sample was part of their training set. We present a theoretical and empirical study of score-based MIAs, focusing on the predicted noise vectors that diffusion models learn to approximate. We show that the expected denoiser output points toward a kernel-weighted local mean of nearby training samples, such that its norm encodes proximity to the training set and thereby reveals membership. Building on this observation, we propose SimA, a single-query attack that provides a principled, efficient alternative to existing multi-query methods. SimA achieves consistently strong performance across variants of DDPM and the Latent Diffusion Model (LDM). Notably, we find that Latent Diffusion Models are surprisingly less vulnerable than pixel-space models, owing to the strong information bottleneck imposed by their latent auto-encoder. We investigate this further by varying the regularization hyperparameter ($β$ in $β$-VAE) of the latent channel and suggest a strategy for making LDM training more robust to MIA. Our results solidify the theory of score-based MIAs, while highlighting that the Latent Diffusion class of methods requires a better understanding of VAE inversion, not simply inversion of the diffusion process.
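The single-query statistic described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' reference implementation: `eps_model`, `sima_score`, and the argument names are hypothetical, and it assumes an epsilon-prediction denoiser and a standard DDPM forward-noising step with cumulative noise schedule value `alpha_bar_t`.

```python
import numpy as np

def sima_score(eps_model, x0, t, alpha_bar_t, rng=None):
    """Hypothetical sketch of a SimA-style single-query MIA statistic.

    eps_model: assumed epsilon-prediction denoiser taking (x_t, t) and
    returning the predicted noise; all names here are illustrative.
    """
    rng = np.random.default_rng(rng)
    # Forward-diffuse the candidate batch x0 to timestep t (DDPM q-sample).
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * noise
    # Single model query: predicted noise vector at timestep t.
    eps_hat = eps_model(x_t, t)
    # Per the paper's analysis, the norm of the denoiser output encodes
    # proximity to the training set; thresholding it (with the decision
    # direction calibrated on held-out data) yields a membership call.
    return np.linalg.norm(eps_hat.reshape(len(eps_hat), -1), axis=1)
```

One forward pass per candidate sample is the key efficiency point: multi-query attacks repeat this over many timesteps or noise draws, while SimA makes the membership decision from a single query.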
Key Contributions
- Theoretical analysis showing that the diffusion denoiser's expected output norm encodes proximity to training samples, providing a principled basis for score-based MIAs
- SimA: a single-query membership inference attack that is more efficient than existing multi-query methods while maintaining strong performance on DDPM and LDM
- Empirical finding that Latent Diffusion Models are less vulnerable than pixel-space models due to the VAE information bottleneck, with a suggested strategy to increase MIA robustness via β-VAE regularization
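For context on the last point, the standard $β$-VAE objective scales the KL regularizer on the latent code; larger $β$ tightens the information bottleneck, which is the mechanism the paper leverages to make LDM latents less sample-specific and the model harder to attack:

$$\mathcal{L}_{\beta\text{-VAE}}(x) = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \beta\, D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\|\,p(z)\big)$$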
🛡️ Threat Analysis
The paper's primary contribution is SimA, a new membership inference attack that determines whether a sample was in the diffusion model's training set by exploiting the norm of predicted noise vectors as a proximity signal — a direct ML04 contribution.