Score-based Membership Inference on Diffusion Models
Mingxing Rao, Bowen Qu, Daniel Moyer
Published on arXiv
arXiv:2509.25003
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
SimA achieves consistently strong membership inference performance on DDPM and LDM with a single query, and reveals that LDMs are surprisingly more robust to MIA than pixel-space models due to the VAE's information bottleneck.
SimA
Novel technique introduced
Membership inference attacks (MIAs) against diffusion models have emerged as a pressing privacy concern, as these models may inadvertently reveal whether a given sample was part of their training set. We present a theoretical and empirical study of score-based MIAs, focusing on the predicted noise vectors that diffusion models learn to approximate. We show that the expected denoiser output points toward a kernel-weighted local mean of nearby training samples, such that its norm encodes proximity to the training set and thereby reveals membership. Building on this observation, we propose SimA, a single-query attack that provides a principled, efficient alternative to existing multi-query methods. SimA achieves consistently strong performance across variants of DDPM and the Latent Diffusion Model (LDM). Notably, we find that Latent Diffusion Models are surprisingly less vulnerable than pixel-space models, owing to the strong information bottleneck imposed by their latent auto-encoder. We investigate this further by varying the regularization hyperparameter ($β$ in $β$-VAE) of the latent channel and suggest a strategy for making LDM training more robust to MIA. Our results solidify the theory of score-based MIAs, while highlighting that the Latent Diffusion class of methods requires a better understanding of VAE inversion, not simply inversion of the diffusion process.
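The single-query statistic described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' reference implementation: `eps_model`, `sima_score`, and the argument names are hypothetical, and it assumes an epsilon-prediction denoiser and a standard DDPM forward-noising step with cumulative noise schedule value `alpha_bar_t`.

```python
import numpy as np

def sima_score(eps_model, x0, t, alpha_bar_t, rng=None):
    """Hypothetical sketch of a SimA-style single-query MIA statistic.

    eps_model: assumed epsilon-prediction denoiser taking (x_t, t) and
    returning the predicted noise; all names here are illustrative.
    """
    rng = np.random.default_rng(rng)
    # Forward-diffuse the candidate batch x0 to timestep t (DDPM q-sample).
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * noise
    # Single model query: predicted noise vector at timestep t.
    eps_hat = eps_model(x_t, t)
    # Per the paper's analysis, the norm of the denoiser output encodes
    # proximity to the training set; thresholding it (with the decision
    # direction calibrated on held-out data) yields a membership call.
    return np.linalg.norm(eps_hat.reshape(len(eps_hat), -1), axis=1)
```

One forward pass per candidate sample is the key efficiency point: multi-query attacks repeat this over many timesteps or noise draws, while SimA makes the membership decision from a single query.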
Key Contributions
- Theoretical analysis showing that the diffusion denoiser's expected output norm encodes proximity to training samples, providing a principled basis for score-based MIAs
- SimA: a single-query membership inference attack that is more efficient than existing multi-query methods while maintaining strong performance on DDPM and LDM
- Empirical finding that Latent Diffusion Models are less vulnerable than pixel-space models due to the VAE information bottleneck, with a suggested strategy to increase MIA robustness via β-VAE regularization
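For context on the last point, the standard $β$-VAE objective scales the KL regularizer on the latent code; larger $β$ tightens the information bottleneck, which is the mechanism the paper leverages to make LDM latents less sample-specific and the model harder to attack:

$$\mathcal{L}_{\beta\text{-VAE}}(x) = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \beta\, D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\|\,p(z)\big)$$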
🛡️ Threat Analysis
The paper's primary contribution is SimA, a new membership inference attack that determines whether a sample was in the diffusion model's training set by exploiting the norm of predicted noise vectors as a proximity signal — a direct ML04 contribution.