Attack (2025)

Prompt Pirates Need a Map: Stealing Seeds helps Stealing Prompts

Felix Mächtle 1, Ashwath Shetty 2, Jonas Sander 1, Nils Loose 1, Sören Pirk 2, Thomas Eisenbarth 1

0 citations


Published on arXiv: 2509.09488

Model Inversion Attack

OWASP ML Top 10 — ML03

Key Finding

SeedSnitch recovers ~95% of CivitAI image seeds via brute-force in ~140 minutes; PromptPirate achieves 8–11% LPIPS improvement over state-of-the-art prompt-stealing baselines.

PromptPirate / SeedSnitch

Novel technique introduced


Diffusion models have significantly advanced text-to-image generation, enabling the creation of highly realistic images conditioned on textual prompts and seeds. Given the considerable intellectual and economic value embedded in such prompts, prompt theft poses a critical security and privacy concern. In this paper, we investigate prompt-stealing attacks targeting diffusion models. We reveal that numerical optimization-based prompt recovery methods are fundamentally limited as they do not account for the initial random noise used during image generation. We identify and exploit a noise-generation vulnerability (CWE-339), prevalent in major image-generation frameworks, originating from PyTorch's restriction of seed values to a range of $2^{32}$ when generating the initial random noise on CPUs. Through a large-scale empirical analysis conducted on images shared via the popular platform CivitAI, we demonstrate that approximately 95% of these images' seed values can be effectively brute-forced in 140 minutes per seed using our seed-recovery tool, SeedSnitch. Leveraging the recovered seed, we propose PromptPirate, a genetic algorithm-based optimization method explicitly designed for prompt stealing. PromptPirate surpasses state-of-the-art methods, i.e., PromptStealer, P2HP, and CLIP-Interrogator, achieving an 8-11% improvement in LPIPS similarity. Furthermore, we introduce straightforward and effective countermeasures that render seed stealing, and thus optimization-based prompt stealing, ineffective. We have disclosed our findings responsibly and initiated coordinated mitigation efforts with the developers to address this critical vulnerability.
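The seed-recovery idea can be illustrated with a minimal sketch. This is not SeedSnitch itself: as a stand-in for regenerating diffusion latents with PyTorch's CPU generator, it uses Python's `random` module as the deterministic noise source, and it searches only a small illustrative slice of the 2^32 space. All function names here are hypothetical.

```python
import random

SEED_SPACE_BITS = 32  # the paper reports PyTorch CPU seeding is limited to 2^32 values


def initial_noise(seed, n=8):
    """Stand-in for deterministic initial-noise generation.

    (The real attack regenerates the diffusion model's initial latents
    with a seeded torch.Generator; here we just draw a few floats.)
    """
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def recover_seed(target_noise, search_limit=1 << 20):
    """Brute-force candidate seeds, comparing regenerated noise to the
    observed noise; a match identifies the seed used for generation."""
    for candidate in range(search_limit):
        if initial_noise(candidate, n=len(target_noise)) == target_noise:
            return candidate
    return None


hidden_seed = 123_456
observed = initial_noise(hidden_seed)
print(recover_seed(observed))  # → 123456
```

Because the comparison is exact regeneration rather than similarity matching, a handful of values suffices to identify the seed; scanning the full 2^32 space is what the paper's ~140-minute figure refers to.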


Key Contributions

  • Identifies and exploits CWE-339 in PyTorch's CPU seed generation (a 2^32 seed space), enabling brute-force seed recovery: the SeedSnitch tool recovers ~95% of CivitAI seeds in ~140 minutes each
  • Proposes PromptPirate, a genetic algorithm-based prompt-stealing method conditioned on recovered seeds, outperforming PromptStealer, P2HP, and CLIP-Interrogator by 8–11% in LPIPS similarity
  • Introduces countermeasures that neutralize seed-recovery attacks and responsibly discloses the vulnerability to framework developers
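The countermeasure direction described above can be sketched in a few lines. This is an assumption about the fix's shape, not the authors' implementation: seeds are drawn from a cryptographically secure source over a space far larger than 2^32, so exhaustive search is no longer feasible.

```python
import secrets


def secure_seed():
    """Draw a 64-bit seed from the OS CSPRNG (illustrative countermeasure).

    With 2^64 candidates, the ~140-minute brute force over 2^32 seeds
    would scale to billions of years, rendering seed recovery impractical.
    """
    return secrets.randbits(64)


seed = secure_seed()
assert 0 <= seed < 2**64
```

The key property is the size and unpredictability of the seed space, not the generator itself: any framework change that stops truncating user seeds to 32 bits on the CPU path achieves the same effect.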

🛡️ Threat Analysis

Model Inversion Attack

The core attack recovers private inference inputs (prompts and seeds) from model outputs (generated images) — a model inversion attack on 'private attributes' used during generation. The adversary reverse-engineers confidential creative IP from observable outputs, fitting the spirit of ML03 even though the private information is inference-time input rather than training data.
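A toy version of the genetic-algorithm loop makes the recovery process concrete. Everything here is a simplifying assumption: the fitness function stands in for (negated) LPIPS between the target image and the image rendered from a candidate prompt with the recovered seed, and we instead score token overlap against a hidden target prompt; `VOCAB`, `TARGET`, and all function names are illustrative, not from the paper.

```python
import random

VOCAB = ["castle", "sunset", "dragon", "forest", "portrait", "neon", "oil", "sketch"]
TARGET = ["castle", "sunset", "dragon"]  # hidden prompt the attacker tries to recover


def fitness(prompt):
    # Toy objective: count target tokens present. PromptPirate would instead
    # render the prompt with the recovered seed and compare images via LPIPS.
    return sum(tok in TARGET for tok in prompt)


def mutate(prompt, rng):
    p = list(prompt)
    p[rng.randrange(len(p))] = rng.choice(VOCAB)
    return p


def crossover(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def optimize(generations=200, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [[rng.choice(VOCAB) for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]  # keep the fittest half
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=fitness)


best = optimize()
```

The point of conditioning on the recovered seed is that the rendering step becomes deterministic, so fitness differences reflect the prompt alone rather than noise, which is what makes this optimization tractable.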


Details

Domains
generative, vision
Model Types
diffusion
Threat Tags
black_box, inference_time
Datasets
CivitAI
Applications
text-to-image generation, prompt marketplaces, AI-generated creative content