Responsible Diffusion: A Comprehensive Survey on Safety, Ethics, and Trust in Diffusion Models
Kang Wei, Xin Yuan, Fushuo Huo, Chuan Ma, Long Yuan, Songze Li, Ming Ding, Dacheng Tao
Published on arXiv (arXiv:2509.22723)
Input Manipulation Attack (OWASP ML Top 10: ML01)
Output Integrity Attack (OWASP ML Top 10: ML09)
Model Poisoning (OWASP ML Top 10: ML10)
Key Finding
Provides a comprehensive security framework for diffusion models covering six threat dimensions (privacy, robustness, safety, fairness, copyright, truthfulness) with systematically organized countermeasures and open research challenges.
Diffusion models (DMs) have attracted significant attention across a wide range of domains for their ability to generate high-quality data. However, like traditional deep learning systems, DMs are exposed to a number of potential threats. To provide advanced and comprehensive insights into safety, ethics, and trust in DMs, this survey elucidates their framework, threats, and countermeasures. Each threat and its countermeasures are systematically examined and categorized to facilitate thorough analysis. Furthermore, we present concrete examples of how DMs are used, the dangers they may pose, and ways to protect against those dangers. Finally, we discuss key lessons learned, highlight open challenges in DM security, and outline prospective research directions in this critical field. This work aims to accelerate progress not only in the technical capabilities of generative artificial intelligence but also in the maturity and wisdom of its application.
Key Contributions
- Systematic taxonomy of threats to diffusion models across privacy, robustness, safety, fairness, copyright, and truthfulness dimensions
- Comprehensive review of countermeasures against each identified threat category with concrete examples
- Discussion of open challenges and future research directions in diffusion model security
🛡️ Threat Analysis
The survey covers the adversarial robustness of diffusion models: attacks that manipulate inference-time inputs, and defenses against such evasion attacks on DMs (a minimal sketch follows).
The copyright, truthfulness, and safety sections address AI-generated content detection, deepfake threats, content watermarking, and output provenance, which are core output-integrity concerns for diffusion models.
The survey explicitly covers backdoor/trojan threats to diffusion models and corresponding countermeasures, such as backdoor detection and Neural Cleanse-style trigger inversion.