Luminark: Training-free, Probabilistically-Certified Watermarking for General Vision Generative Models
Jiayi Xu, Zhang Zhang, Yuanrui Zhang, Ruitao Chen, Yixian Xu, Tianyu He, Di He
Published on arXiv
2601.01085
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Luminark achieves probabilistically-certified watermark detection with exponentially low false positive rates and strong robustness against common image transformations across nine generative models, while substantially outperforming prior certified methods on image quality.
Luminark
Novel technique introduced
In this paper, we introduce *Luminark*, a training-free and probabilistically-certified watermarking method for general vision generative models. Our approach is built upon a novel watermark definition that leverages patch-level luminance statistics. Specifically, the service provider predefines a binary pattern together with corresponding patch-level thresholds. To detect a watermark in a given image, we evaluate whether the luminance of each patch surpasses its threshold and then verify whether the resulting binary pattern aligns with the target one. A simple statistical analysis demonstrates that the false positive rate of the proposed method can be effectively controlled, thereby ensuring certified detection. To enable seamless watermark injection across different paradigms, we leverage the widely adopted guidance technique as a plug-and-play mechanism and develop *watermark guidance*. This design enables Luminark to achieve generality across state-of-the-art generative models without compromising image quality. Empirically, we evaluate our approach on nine models spanning diffusion, autoregressive, and hybrid frameworks. Across all evaluations, Luminark consistently demonstrates high detection accuracy, strong robustness against common image transformations, and strong visual quality.
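The detection procedure described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the patch size, the use of Rec. 601 luma as the luminance measure, the `max_mismatch` tolerance, and all function and parameter names are assumptions for the sketch.

```python
import numpy as np

def detect_watermark(image, thresholds, target_pattern, patch=32, max_mismatch=0):
    """Hypothetical detector sketch. `image` is an (H, W, 3) RGB array in [0, 1];
    `thresholds` and `target_pattern` are (H//patch, W//patch) arrays that the
    service provider would predefine (shapes and names are illustrative)."""
    # Rec. 601 luma as a stand-in luminance measure
    luma = 0.299 * image[..., 0] + 0.587 * image[..., 1] + 0.114 * image[..., 2]
    h, w = luma.shape
    rows, cols = h // patch, w // patch
    # Mean luminance over non-overlapping patches
    patches = luma[:rows * patch, :cols * patch].reshape(rows, patch, cols, patch)
    means = patches.mean(axis=(1, 3))
    # One bit per patch: does its mean luminance exceed the threshold?
    bits = (means > thresholds).astype(int)
    # Declare a watermark if the extracted bits match the target pattern
    mismatches = int((bits != target_pattern).sum())
    return mismatches <= max_mismatch
```

Allowing a small `max_mismatch` trades a slightly larger (still certified) false positive rate for robustness to image transformations that flip a few patch bits.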
Key Contributions
- Novel patch-level luminance binary-pattern watermark definition with mathematically certified false-positive rate control.
- Training-free 'watermark guidance' injection mechanism that generalizes as a plug-and-play module across diffusion, autoregressive, and hybrid generative frameworks.
- Empirical evaluation across nine state-of-the-art vision generative models demonstrating high detection accuracy, transformation robustness, and preserved image quality.
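To see why the false-positive rate is exponentially small, consider a simple null model (an assumption for illustration; the paper's exact analysis may differ): each patch bit of a non-watermarked image matches the target independently with probability 1/2. Requiring at least n − k of n bits to match then bounds the false positive rate by a Binomial(n, 1/2) tail, which decays exponentially in n:

```python
from math import comb

def fpr_bound(n_bits, max_mismatch):
    """Binomial(n_bits, 1/2) tail: probability that a random, non-watermarked
    image matches at least n_bits - max_mismatch target bits by chance.
    (Illustrative null model, not the paper's exact statistic.)"""
    min_match = n_bits - max_mismatch
    return sum(comb(n_bits, i) for i in range(min_match, n_bits + 1)) / 2 ** n_bits

# e.g. a 16x16 grid of patch bits with up to 32 mismatches tolerated
print(fpr_bound(256, 32))
```

With an exact-match rule (`max_mismatch = 0`) the bound is simply 2^(−n); tolerating mismatches for robustness raises it, but it remains negligible for grids of a few hundred patches.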
🛡️ Threat Analysis
Embeds watermarks in model OUTPUT images (not model weights) to prove content provenance and enable detection of AI-generated imagery — classic output integrity / content watermarking.