Survey · 2025

Responsible Diffusion: A Comprehensive Survey on Safety, Ethics, and Trust in Diffusion Models

Kang Wei 1, Xin Yuan 2, Fushuo Huo 3, Chuan Ma 4, Long Yuan 5, Songze Li 1, Ming Ding 2, Dacheng Tao 3

1 citation · 122 references · arXiv

Published on arXiv: 2509.22723

OWASP ML Top 10 mappings: Input Manipulation Attack (ML01) · Output Integrity Attack (ML09) · Model Poisoning (ML10)

Key Finding

Provides a comprehensive security framework for diffusion models covering six threat dimensions (privacy, robustness, safety, fairness, copyright, truthfulness) with systematically organized countermeasures and open research challenges.


Diffusion models (DMs) have been investigated in various domains due to their ability to generate high-quality data, thereby attracting significant attention. However, like traditional deep learning systems, DMs are exposed to potential threats. To provide advanced and comprehensive insights into safety, ethics, and trust in DMs, this survey elucidates the framework of DMs, the threats they face, and the corresponding countermeasures. Each threat and its countermeasures are systematically examined and categorized to facilitate thorough analysis. Furthermore, we present concrete examples of how DMs are used, the risks they can introduce, and defenses against those risks. Finally, we discuss key lessons learned, highlight open challenges related to DM security, and outline prospective research directions in this critical field. This work aims to accelerate progress not only in the technical capabilities of generative artificial intelligence but also in the maturity and wisdom of its application.


Key Contributions

  • Systematic taxonomy of threats to diffusion models across privacy, robustness, safety, fairness, copyright, and truthfulness dimensions
  • Comprehensive review of countermeasures against each identified threat category with concrete examples
  • Discussion of open challenges and future research directions in diffusion model security

🛡️ Threat Analysis

Input Manipulation Attack

The survey covers the adversarial robustness of diffusion models: attacks that manipulate inference-time inputs and defenses against evasion attacks on DMs.
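
To make this threat class concrete, below is a minimal sketch (assuming a PyTorch setup) of a PGD-style evasion loop that perturbs an input image so that a pipeline's image encoder maps it away from its clean latent, the basic mechanism behind inference-time input manipulation of image-to-image DMs. The `ToyEncoder` and `pgd_latent_attack` names are illustrative stand-ins, not code from the survey.

```python
# Minimal PGD-style input-manipulation sketch: maximize the distance between the
# encoding of a perturbed image and its clean encoding, within an L-infinity ball.
# ToyEncoder is a hypothetical stand-in for a real VAE image encoder.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEncoder(nn.Module):
    """Hypothetical stand-in for a diffusion pipeline's image encoder."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 4, 3, stride=2, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def pgd_latent_attack(encoder, image, eps=8 / 255, alpha=2 / 255, steps=20):
    """PGD that pushes the encoding of `image` away from its clean latent."""
    clean_latent = encoder(image).detach()
    adv = image.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.mse_loss(encoder(adv), clean_latent)
        loss.backward()
        with torch.no_grad():
            adv = adv + alpha * adv.grad.sign()           # ascend latent distance
            adv = image + (adv - image).clamp(-eps, eps)  # project into the eps-ball
            adv = adv.clamp(0, 1)                         # keep a valid image
        adv = adv.detach()
    return adv

encoder = ToyEncoder().eval()
image = torch.rand(1, 3, 64, 64)
adv_image = pgd_latent_attack(encoder, image)
print("max pixel change:", (adv_image - image).abs().max().item())
```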

Output Integrity Attack

The copyright, truthfulness, and safety sections address AI-generated content detection, deepfake threats, content watermarking, and output provenance, which are core output-integrity concerns for diffusion models.
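
As an illustration of the provenance countermeasures in this space, here is a toy spread-spectrum watermark sketch in NumPy: a key-derived pseudorandom pattern is added to a generated image at low amplitude and later detected by correlating against the same key. The functions are hypothetical and do not reproduce any specific scheme discussed in the survey.

```python
# Toy spread-spectrum watermark for output provenance: embed a key-derived pattern
# at low amplitude, then detect it via normalized correlation with the same key.
import numpy as np

def key_pattern(key: int, shape) -> np.ndarray:
    """Zero-mean, unit-variance pattern derived from a secret key."""
    p = np.random.default_rng(key).standard_normal(shape)
    return (p - p.mean()) / p.std()

def embed(image: np.ndarray, key: int, strength: float = 0.04) -> np.ndarray:
    """Add the key pattern at low amplitude (image assumed to lie in [0, 1])."""
    return np.clip(image + strength * key_pattern(key, image.shape), 0.0, 1.0)

def detect(image: np.ndarray, key: int) -> float:
    """Normalized correlation with the key pattern; higher means 'watermark present'."""
    p = key_pattern(key, image.shape)
    centered = image - image.mean()
    return float((centered * p).sum() / (np.linalg.norm(centered) * np.linalg.norm(p) + 1e-12))

generated = np.random.default_rng(0).random((256, 256))  # stand-in for a DM output
marked = embed(generated, key=1234)
print("score (right key):", round(detect(marked, 1234), 3))
print("score (wrong key):", round(detect(marked, 9999), 3))
```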

Model Poisoning

The survey explicitly covers backdoor/trojan threats to diffusion models and corresponding countermeasures, such as backdoor detection and Neural Cleanse-style trigger reconstruction.
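
The sketch below illustrates one simple detection idea in this category: screening candidate trigger tokens by comparing a generator's outputs with and without each token, since a planted backdoor typically collapses outputs toward a fixed target. `backdoored_generate` is a hypothetical stand-in for a poisoned pipeline, not a method taken from the survey.

```python
# Toy backdoor screen for a text-conditioned generator: a large mean shift combined
# with a sharp drop in output variance when a token is appended is suspicious.
import numpy as np

rng = np.random.default_rng(0)
TARGET = rng.standard_normal(32)  # the attacker's fixed target output (toy features)

def backdoored_generate(prompt: str) -> np.ndarray:
    """Hypothetical poisoned generator: behaves normally unless the trigger appears."""
    if "zx_trigger" in prompt:
        return TARGET + 0.05 * rng.standard_normal(32)
    seed = abs(hash(prompt)) % (2 ** 32)
    return np.random.default_rng(seed).standard_normal(32)

def screen(generate, prompts, candidates):
    """Score each candidate token by mean shift and output-variance ratio."""
    base = np.stack([generate(p) for p in prompts])
    report = {}
    for tok in candidates:
        trig = np.stack([generate(p + " " + tok) for p in prompts])
        mean_shift = np.linalg.norm(trig.mean(0) - base.mean(0))
        var_ratio = trig.var(0).mean() / (base.var(0).mean() + 1e-12)
        report[tok] = (round(float(mean_shift), 2), round(float(var_ratio), 3))
    return report

prompts = [f"a photo of object {i}" for i in range(20)]
print(screen(backdoored_generate, prompts, ["sunset", "zx_trigger", "blue"]))
```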


Details

Domains
vision · generative · multimodal
Model Types
diffusion
Threat Tags
training_time · inference_time · digital
Applications
text-to-image generation · image synthesis · generative AI