OSI: One-step Inversion Excels in Extracting Diffusion Watermarks
Yuwei Chen 1,2, Zhenliang He 1,2, Jia Tang 1,3, Meina Kan 1,2, Shiguang Shan 1,2
Published on arXiv
2602.09494
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
OSI extracts Gaussian Shading style watermarks 20x faster than multi-step diffusion inversion while achieving higher accuracy and doubling payload capacity across diverse schedulers and diffusion backbones.
OSI (One-step Inversion)
Novel technique introduced
Watermarking is an important mechanism for provenance and copyright protection of diffusion-generated images. Training-free methods, exemplified by Gaussian Shading, embed watermarks into the initial noise of diffusion models with negligible impact on the quality of generated images. However, extracting this type of watermark typically requires multi-step diffusion inversion to obtain precise initial noise, which is computationally expensive and time-consuming. To address this issue, we propose One-step Inversion (OSI), a significantly faster and more accurate method for extracting Gaussian Shading style watermarks. OSI reformulates watermark extraction as a learnable sign classification problem, which eliminates the need for precise regression of the initial noise. Then, we initialize the OSI model from the diffusion backbone and finetune it on synthesized noise-image pairs with a sign classification objective. In this manner, the OSI model is able to accomplish the watermark extraction efficiently in only one step. Our OSI substantially outperforms the multi-step diffusion inversion method: it is 20x faster, achieves higher extraction accuracy, and doubles the watermark payload capacity. Extensive experiments across diverse schedulers, diffusion backbones, and cryptographic schemes consistently show improvements, demonstrating the generality of our OSI framework.
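To illustrate why sign recovery can be enough for extraction, here is a minimal toy sketch in Python. It is not the Gaussian Shading or OSI implementation: it assumes a simplified scheme in which each message bit fixes the sign of a group of Gaussian noise entries, and it omits the encryption and distribution-preserving sampling a real scheme would use. The names `embed_bits`, `extract_bits`, and the `repeat` parameter are hypothetical and chosen only for this example.

```python
import random

def embed_bits(bits, repeat, rng):
    """Toy embedding (assumption): each bit fixes the sign of `repeat` noise entries."""
    noise = []
    for b in bits:
        sign = 1.0 if b == 1 else -1.0
        for _ in range(repeat):
            # Magnitude is half-normal; only the sign carries the message.
            noise.append(sign * abs(rng.gauss(0.0, 1.0)))
    return noise

def extract_bits(noise_estimate, repeat):
    """Recover each bit by majority vote over the signs of its noise entries."""
    bits = []
    for i in range(0, len(noise_estimate), repeat):
        chunk = noise_estimate[i:i + repeat]
        positives = sum(1 for v in chunk if v > 0)
        bits.append(1 if 2 * positives > len(chunk) else 0)
    return bits

rng = random.Random(0)
message = [rng.randint(0, 1) for _ in range(32)]
noise = embed_bits(message, repeat=9, rng=rng)

# Simulate an imprecise inversion: magnitudes are badly off (scaled by 0.1
# and perturbed), but most signs survive, so the vote still recovers the bits.
estimate = [0.1 * v + rng.gauss(0.0, 0.02) for v in noise]
recovered = extract_bits(estimate, repeat=9)
accuracy = sum(a == b for a, b in zip(message, recovered)) / len(message)
```

In this toy setting the extractor never needs the exact noise values, which is the intuition behind replacing precise regression with sign classification. The repetition factor also hints at the accuracy and capacity trade-off: fewer sign errors allow fewer repetitions per bit.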
Key Contributions
- Reformulates watermark extraction as a learnable sign classification problem, eliminating the need for precise initial-noise regression
- Initializes the OSI model from the diffusion backbone and fine-tunes on synthesized noise-image pairs, enabling single-step extraction
- Achieves 20x speedup over multi-step diffusion inversion while improving extraction accuracy and doubling watermark payload capacity
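The first contribution above can be made concrete with a small sketch contrasting a sign classification loss with a regression loss. The summary does not give the exact objective used in the paper, so this assumes a standard binary cross-entropy on per-element sign logits; `sign_bce_loss` and `mse_loss` are hypothetical helper names used only here.

```python
import math

def sign_bce_loss(logits, true_noise):
    """Mean binary cross-entropy between sign logits and the sign of the true noise."""
    total = 0.0
    for z, eps in zip(logits, true_noise):
        y = 1.0 if eps > 0 else 0.0  # label: 1 for positive noise, 0 otherwise
        # Numerically stable form of -[y*log(s(z)) + (1-y)*log(1-s(z))], s = sigmoid
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

def mse_loss(pred, true_noise):
    """Mean squared error, i.e. the regression view of noise recovery."""
    return sum((p - t) ** 2 for p, t in zip(pred, true_noise)) / len(pred)

true_noise = [1.5, -0.3, 0.8, -2.0]
# Outputs with every sign correct but magnitudes far from the true values.
outputs = [6.0, -6.0, 6.0, -6.0]

classification_loss = sign_bce_loss(outputs, true_noise)  # small: signs all match
regression_loss = mse_loss(outputs, true_noise)           # large: magnitudes are off
```

Under this assumed objective, a model is penalized only for getting signs wrong, not for missing the magnitude of the initial noise, which is a looser target than full regression.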
🛡️ Threat Analysis
Addresses content watermarking of diffusion-generated images for provenance and copyright protection, which falls directly within the output integrity and content provenance domain. OSI improves the extraction (verification) component of a content watermarking system, making it faster and more accurate.