Attack (2025)

Prompt Pirates Need a Map: Stealing Seeds helps Stealing Prompts

Felix Mächtle 1, Ashwath Shetty 2, Jonas Sander 1, Nils Loose 1, Sören Pirk 2, Thomas Eisenbarth 1

0 citations


Published on arXiv: 2509.09488

Model Inversion Attack

OWASP ML Top 10 — ML03

Key Finding

SeedSnitch recovers ~95% of CivitAI image seeds via brute-force in ~140 minutes; PromptPirate achieves 8–11% LPIPS improvement over state-of-the-art prompt-stealing baselines.

PromptPirate / SeedSnitch

Novel technique introduced


Diffusion models have significantly advanced text-to-image generation, enabling the creation of highly realistic images conditioned on textual prompts and seeds. Given the considerable intellectual and economic value embedded in such prompts, prompt theft poses a critical security and privacy concern. In this paper, we investigate prompt-stealing attacks targeting diffusion models. We reveal that numerical optimization-based prompt recovery methods are fundamentally limited as they do not account for the initial random noise used during image generation. We identify and exploit a noise-generation vulnerability (CWE-339), prevalent in major image-generation frameworks, originating from PyTorch's restriction of seed values to a range of $2^{32}$ when generating the initial random noise on CPUs. Through a large-scale empirical analysis conducted on images shared via the popular platform CivitAI, we demonstrate that approximately 95% of these images' seed values can be effectively brute-forced in 140 minutes per seed using our seed-recovery tool, SeedSnitch. Leveraging the recovered seed, we propose PromptPirate, a genetic algorithm-based optimization method explicitly designed for prompt stealing. PromptPirate surpasses state-of-the-art methods, i.e., PromptStealer, P2HP, and CLIP-Interrogator, achieving an 8-11% improvement in LPIPS similarity. Furthermore, we introduce straightforward and effective countermeasures that render seed stealing, and thus optimization-based prompt stealing, ineffective. We have disclosed our findings responsibly and initiated coordinated mitigation efforts with the developers to address this critical vulnerability.
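The seed-recovery idea can be illustrated with a minimal sketch. This is not SeedSnitch itself: as a stand-in for regenerating diffusion latents with PyTorch's CPU generator, it uses Python's `random` module as the deterministic noise source, and it searches only a small illustrative slice of the 2^32 space. All function names here are hypothetical.

```python
import random

SEED_SPACE_BITS = 32  # the paper reports PyTorch CPU seeding is limited to 2^32 values


def initial_noise(seed, n=8):
    """Stand-in for deterministic initial-noise generation.

    (The real attack regenerates the diffusion model's initial latents
    with a seeded torch.Generator; here we just draw a few floats.)
    """
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def recover_seed(target_noise, search_limit=1 << 20):
    """Brute-force candidate seeds, comparing regenerated noise to the
    observed noise; a match identifies the seed used for generation."""
    for candidate in range(search_limit):
        if initial_noise(candidate, n=len(target_noise)) == target_noise:
            return candidate
    return None


hidden_seed = 123_456
observed = initial_noise(hidden_seed)
print(recover_seed(observed))  # → 123456
```

Because the comparison is exact regeneration rather than similarity matching, a handful of values suffices to identify the seed; scanning the full 2^32 space is what the paper's ~140-minute figure refers to.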


Key Contributions

  • Identifies and exploits CWE-339 in PyTorch's CPU seed generation (a 2^32 seed space), enabling brute-force seed recovery: the SeedSnitch tool recovers ~95% of CivitAI seeds in ~140 minutes each
  • Proposes PromptPirate, a genetic algorithm-based prompt-stealing method conditioned on recovered seeds, outperforming PromptStealer, P2HP, and CLIP-Interrogator by 8–11% in LPIPS similarity
  • Introduces countermeasures that neutralize seed-recovery attacks and responsibly discloses the vulnerability to framework developers
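The countermeasure direction described above can be sketched in a few lines. This is an assumption about the fix's shape, not the authors' implementation: seeds are drawn from a cryptographically secure source over a space far larger than 2^32, so exhaustive search is no longer feasible.

```python
import secrets


def secure_seed():
    """Draw a 64-bit seed from the OS CSPRNG (illustrative countermeasure).

    With 2^64 candidates, the ~140-minute brute force over 2^32 seeds
    would scale to billions of years, rendering seed recovery impractical.
    """
    return secrets.randbits(64)


seed = secure_seed()
assert 0 <= seed < 2**64
```

The key property is the size and unpredictability of the seed space, not the generator itself: any framework change that stops truncating user seeds to 32 bits on the CPU path achieves the same effect.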

🛡️ Threat Analysis

Model Inversion Attack

The core attack recovers private inference inputs (prompts and seeds) from model outputs (generated images) — a model inversion attack on 'private attributes' used during generation. The adversary reverse-engineers confidential creative IP from observable outputs, fitting the spirit of ML03 even though the private information is inference-time input rather than training data.
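A toy version of the genetic-algorithm loop makes the recovery process concrete. Everything here is a simplifying assumption: the fitness function stands in for (negated) LPIPS between the target image and the image rendered from a candidate prompt with the recovered seed, and we instead score token overlap against a hidden target prompt; `VOCAB`, `TARGET`, and all function names are illustrative, not from the paper.

```python
import random

VOCAB = ["castle", "sunset", "dragon", "forest", "portrait", "neon", "oil", "sketch"]
TARGET = ["castle", "sunset", "dragon"]  # hidden prompt the attacker tries to recover


def fitness(prompt):
    # Toy objective: count target tokens present. PromptPirate would instead
    # render the prompt with the recovered seed and compare images via LPIPS.
    return sum(tok in TARGET for tok in prompt)


def mutate(prompt, rng):
    p = list(prompt)
    p[rng.randrange(len(p))] = rng.choice(VOCAB)
    return p


def crossover(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def optimize(generations=200, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [[rng.choice(VOCAB) for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]  # keep the fittest half
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=fitness)


best = optimize()
```

The point of conditioning on the recovered seed is that the rendering step becomes deterministic, so fitness differences reflect the prompt alone rather than noise, which is what makes this optimization tractable.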


Details

Domains
generative, vision
Model Types
diffusion
Threat Tags
black_box, inference_time
Datasets
CivitAI
Applications
text-to-image generation, prompt marketplaces, AI-generated creative content