
Noise Aggregation Analysis Driven by Small-Noise Injection: Efficient Membership Inference for Diffusion Models

Guo Li , Yuyang Yu , Xuemiao Xu

0 citations · 37 references · arXiv


Published on arXiv (2510.21783)

Membership Inference Attack

OWASP ML Top 10 — ML04

Key Finding

Achieves superior AUC and ASR with fewer model queries than existing MIA methods across multiple diffusion model benchmarks including large-scale text-to-image models.


Diffusion models have demonstrated powerful performance in generating high-quality images; a typical example is a text-to-image generator such as Stable Diffusion. However, their widespread use also poses potential privacy risks. A key concern is membership inference attacks, which attempt to determine whether a particular data sample was used to train the model. We propose an efficient membership inference attack against diffusion models, based on injecting slight noise and evaluating the aggregation degree of the resulting noise distribution. The intuition is that a diffusion model's noise-prediction patterns differ distinguishably between training-set and non-training-set samples. Specifically, we suppose that member images exhibit higher aggregation of predicted noise around a certain time step of the diffusion process, whereas the predicted noises of non-member images are more dispersed around that time step. Compared with existing methods, our approach requires fewer queries to the target diffusion model: we inject slight noise into the image under test and determine its membership by analyzing the aggregation degree of the noise distribution predicted by the model. Empirical results indicate that our method achieves superior performance across multiple datasets. It also attains better attack effectiveness, in both ASR and AUC, against large-scale text-to-image diffusion models, demonstrating its scalability.


Key Contributions

  • Novel MIA that injects low-intensity noise into candidate images and determines membership by measuring the aggregation degree of the diffusion model's predicted noise across adjacent timesteps.
  • Demonstrates that small-noise injection amplifies the behavioral gap between member and non-member samples while drastically reducing the number of model queries required compared to prior methods.
  • Empirically validates scalability to large-scale text-to-image diffusion models (e.g., Stable Diffusion), achieving superior AUC and ASR across multiple datasets.
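The attack loop described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `model(x_noisy, t)` query interface, the Gaussian-perturbation choice, and the use of mean distance to the centroid as the "aggregation degree" are all assumptions made here for clarity.

```python
import numpy as np

def aggregation_score(model, x, t_center, sigma=0.05, window=2, rng=None):
    """Membership score for candidate image x (hypothetical sketch).

    Inject low-intensity Gaussian noise, query the target diffusion model's
    noise prediction at timesteps adjacent to t_center, and measure how
    tightly the predicted noises cluster. A tighter cluster (higher
    aggregation) is taken as evidence that x was a training member.
    """
    rng = np.random.default_rng(rng)
    preds = []
    for t in range(t_center - window, t_center + window + 1):
        x_noisy = x + sigma * rng.standard_normal(x.shape)  # slight noise injection
        preds.append(np.asarray(model(x_noisy, t)).ravel())  # black-box query
    preds = np.stack(preds)
    centroid = preds.mean(axis=0)
    # Dispersion of predicted noises around their centroid; lower = more aggregated.
    dispersion = np.mean(np.linalg.norm(preds - centroid, axis=1))
    return -dispersion  # higher score => more likely a training member

def infer_membership(model, x, t_center, threshold, **kw):
    """Binary membership decision by thresholding the aggregation score."""
    return aggregation_score(model, x, t_center, **kw) >= threshold
```

Note that the whole attack costs only `2 * window + 1` model queries per image (5 with the defaults above), which is the efficiency property the paper emphasizes; the threshold would in practice be calibrated on known member/non-member samples.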

🛡️ Threat Analysis

Membership Inference Attack

The paper's sole contribution is a membership inference attack: it determines whether specific images were used to train a diffusion model by analyzing the aggregation of the model's predicted noise distributions, a direct binary membership-determination task.


Details

Domains
vision, generative
Model Types
diffusion
Threat Tags
black_box, inference_time
Applications
image generation, text-to-image generation