Proof-of-Authorship for Diffusion-based AI Generated Content
De Zhang Lee , Han Fang , Ee-Chien Chang
Published on arXiv (2603.17513)
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Achieves proof-of-authorship for diffusion-generated images with a quantifiable false-claim probability, avoiding the vulnerabilities of secret-dependent watermarking
Proof-of-Authorship (POA)
Novel technique introduced
Recent advancements in AI-generated content (AIGC) have introduced new challenges in intellectual property protection and the authentication of generated objects. We focus on scenarios in which an author seeks to assert authorship of an object generated using latent diffusion models (LDMs), in the presence of adversaries who attempt to falsely claim authorship of objects they did not create. While proof-of-ownership has been studied in the context of multimedia content through techniques such as time-stamping and watermarking, these approaches face notable limitations. In contrast to traditional content creation sources (e.g., cameras), the LDM generation process offers greater control to the author. Specifically, the random seed used during generation can be deliberately chosen. By binding the seed to the author's identity using cryptographic pseudorandom functions, the author can assert that they are the creator of the object. We refer to this stronger guarantee as proof-of-authorship, since only the creator of the object can legitimately claim it. This contrasts with proof-of-ownership via time-stamping or watermarking, where any entity could potentially claim ownership of an object simply by being the first to timestamp it or embed a watermark. We propose a proof-of-authorship framework involving a probabilistic adjudicator who quantifies the probability that a claim is false. Furthermore, unlike prior approaches, the proposed framework does not involve any secret. We explore various attack scenarios and analyze design choices using Stable Diffusion 2.1 (SD2.1) as a representative case study.
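The seed-binding idea can be sketched as follows. This is a minimal illustration, not the paper's construction: it assumes the generation seed is derived by applying a keyless cryptographic hash (standing in for the pseudorandom function) to the author's public identity and a per-image index, both of which are hypothetical parameters.

```python
import hashlib

def derive_seed(author_id: str, image_index: int) -> int:
    """Bind a generation seed to an author's public identity.

    SHA-256 stands in for the pseudorandom function described in the
    paper; `author_id` and `image_index` are illustrative inputs, not
    the paper's exact parameterization.
    """
    msg = f"{author_id}|{image_index}".encode("utf-8")
    digest = hashlib.sha256(msg).digest()
    # Truncate the digest to a 64-bit integer usable as an LDM sampler seed.
    return int.from_bytes(digest[:8], "big")

seed = derive_seed("alice@example.org", 0)
```

Because the derivation uses only public inputs, anyone can recompute the seed and check that it reproduces the claimed image; no secret needs to be stored or protected.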
Key Contributions
- Proof-of-authorship framework using cryptographic binding of generation seeds to author identity via pseudorandom functions
- Probabilistic adjudicator that quantifies false claim probability without requiring secrets
- Analysis of attack scenarios and statistical confidence in authorship claims for Stable Diffusion 2.1
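The adjudication step described above can be sketched as a regeneration check: derive the identity-bound seed, regenerate the image, and accept the claim only if the result matches the claimed image within a tolerance. The `generate` callable and the distance threshold are assumptions for illustration; the paper's adjudicator additionally quantifies the probability that an accepted claim is false.

```python
import hashlib

def derive_seed(author_id: str, image_index: int) -> int:
    # Keyless hash standing in for the paper's pseudorandom function.
    msg = f"{author_id}|{image_index}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(msg).digest()[:8], "big")

def adjudicate(author_id: str, image_index: int, claimed_image,
               generate, threshold: float) -> bool:
    """Accept an authorship claim iff regenerating from the
    identity-bound seed reproduces the claimed image.

    `generate` is a hypothetical deterministic LDM sampler mapping a
    seed to pixel values; `threshold` bounds the allowed squared
    distance between the regenerated and claimed images.
    """
    regenerated = generate(derive_seed(author_id, image_index))
    dist = sum((a - b) ** 2 for a, b in zip(regenerated, claimed_image))
    return dist <= threshold
```

An adversary who did not choose the seed would need a seed of their own identity that happens to regenerate the same image, which is exactly the event whose probability the adjudicator bounds.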
🛡️ Threat Analysis
The paper addresses output integrity and content provenance for AI-generated images. The framework authenticates authorship of diffusion model outputs and defends against false authorship claims. This is content provenance/authentication (ML09), not model IP protection (ML05): the goal is to prove who generated a specific image, not to protect the model itself.