defense 2026

Luminark: Training-free, Probabilistically-Certified Watermarking for General Vision Generative Models

Jiayi Xu ¹, Zhang Zhang ¹, Yuanrui Zhang ¹, Ruitao Chen ¹, Yixian Xu ¹, Tianyu He ², Di He ¹

¹ Peking University

² Microsoft Research Asia

0 citations · 57 references · arXiv

Published on arXiv

2601.01085

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Luminark achieves probabilistically-certified watermark detection with exponentially low false positive rates and strong robustness against common image transformations across nine generative models, while substantially outperforming prior certified methods on image quality.
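One way to see why the false positive rate can fall exponentially is the worked bound below. It is a sketch under an assumed null model, not the paper's own derivation: each of the n patch bits of an unwatermarked image is taken to match the target pattern independently with probability 1/2, and an image is flagged when at least m bits match. The symbols n, m, M, b_i, t_i, and ℓ_i (patch luminance) are introduced here for illustration only.

```latex
% Assumed null model (not taken from the paper): on an unwatermarked image,
% each of the n patch bits matches the target independently with probability 1/2.
\[
\hat{b}_i = \mathbf{1}\bigl[\ell_i > t_i\bigr],
\qquad
M = \sum_{i=1}^{n} \mathbf{1}\bigl[\hat{b}_i = b_i\bigr]
\sim \mathrm{Binomial}\bigl(n, \tfrac12\bigr)
\]
% Exact binomial tail, then Hoeffding's inequality for the exponential bound.
\[
\mathrm{FPR}(m) = \Pr[M \ge m]
= 2^{-n} \sum_{k=m}^{n} \binom{n}{k}
\;\le\; \exp\!\Bigl(-2n\bigl(\tfrac{m}{n} - \tfrac12\bigr)^{2}\Bigr),
\qquad m \ge \tfrac{n}{2}
\]
% Example: n = 64 patches and m = 56 required matches give a bound of e^{-18},
% roughly 1.5e-8.
```

Under this assumption the bound shrinks exponentially in the number of patches for a fixed match fraction above 1/2, which is consistent with the "exponentially low" wording above.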

Luminark

Novel technique introduced

In this paper, we introduce Luminark, a training-free and probabilistically-certified watermarking method for general vision generative models. Our approach is built upon a novel watermark definition that leverages patch-level luminance statistics. Specifically, the service provider predefines a binary pattern together with corresponding patch-level thresholds. To detect a watermark in a given image, we evaluate whether the luminance of each patch surpasses its threshold and then verify whether the resulting binary pattern aligns with the target one. A simple statistical analysis demonstrates that the false positive rate of the proposed method can be effectively controlled, thereby ensuring certified detection. To enable seamless watermark injection across different paradigms, we leverage the widely adopted guidance technique as a plug-and-play mechanism and develop the watermark guidance. This design enables Luminark to achieve generality across state-of-the-art generative models without compromising image quality. Empirically, we evaluate our approach on nine models spanning diffusion, autoregressive, and hybrid frameworks. Across all evaluations, Luminark consistently demonstrates high detection accuracy, strong robustness against common image transformations, and good performance on visual quality.
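The detection procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the Rec. 601 luma weights, the use of non-overlapping square patches, and the `min_matches` decision rule are all assumptions made for the example, as is the independent fair-coin model behind `false_positive_rate`.

```python
from math import comb

import numpy as np


def patch_luminance(image, patch):
    """Mean luminance of each non-overlapping patch of an RGB image in [0, 1]."""
    # Rec. 601 luma weights (assumed; the paper's luminance definition may differ).
    luma = image @ np.array([0.299, 0.587, 0.114])
    rows, cols = luma.shape[0] // patch, luma.shape[1] // patch
    luma = luma[: rows * patch, : cols * patch]  # drop any ragged border
    return luma.reshape(rows, patch, cols, patch).mean(axis=(1, 3))


def detect(image, pattern, thresholds, patch, min_matches):
    """Compare thresholded patch luminance against the target binary pattern."""
    bits = patch_luminance(image, patch) > thresholds
    matches = int((bits == pattern).sum())
    # Hypothetical decision rule: flag the image if enough bits agree.
    return matches >= min_matches, matches


def false_positive_rate(n, min_matches):
    """P[Binomial(n, 1/2) >= min_matches], assuming independent fair-coin bits."""
    return sum(comb(n, k) for k in range(min_matches, n + 1)) / 2**n
```

For instance, with an 8x8 grid of patches (n = 64), calling `false_positive_rate(64, 56)` gives the chance that an unwatermarked image would be flagged under the assumed null model, which lets a provider pick `min_matches` for a target false positive rate.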

Key Contributions

Novel patch-level luminance binary-pattern watermark definition with mathematically certified false-positive rate control.
Training-free 'watermark guidance' injection mechanism that generalizes as a plug-and-play module across diffusion, autoregressive, and hybrid generative frameworks.
Empirical evaluation across nine state-of-the-art vision generative models demonstrating high detection accuracy, transformation robustness, and preserved image quality.
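To make the idea of a training-free guidance signal concrete, the toy sketch below defines a hinge loss that is zero once every patch's mean luminance sits on the correct side of its threshold by a margin, and returns its gradient with respect to the image. Everything here is an assumption for illustration: the loss form, the `margin` and `scale` parameters, the function names, and the Rec. 601 weights are hypothetical, and the step is applied directly to pixels, whereas a real sampler would apply guidance inside the model's generation loop.

```python
import numpy as np

LUMA = np.array([0.299, 0.587, 0.114])  # Rec. 601 weights (assumed)


def guidance_grad(image, pattern, thresholds, patch, margin=0.05):
    """Hinge loss on patch luminance and its gradient w.r.t. the RGB image."""
    rows, cols = image.shape[0] // patch, image.shape[1] // patch
    luma = (image @ LUMA)[: rows * patch, : cols * patch]
    means = luma.reshape(rows, patch, cols, patch).mean(axis=(1, 3))
    sign = np.where(pattern, 1.0, -1.0)  # bit 1: push above, bit 0: push below
    slack = margin - sign * (means - thresholds)  # > 0 means the bit is not yet safe
    loss = np.maximum(slack, 0.0).sum()
    # d(loss)/d(patch mean) is -sign on violated patches and 0 elsewhere;
    # each pixel contributes 1 / patch**2 to its patch mean.
    d_mean = np.where(slack > 0, -sign, 0.0) / patch**2
    per_pixel = np.kron(d_mean, np.ones((patch, patch)))
    grad = np.zeros_like(image, dtype=float)
    grad[: rows * patch, : cols * patch] = per_pixel[..., None] * LUMA
    return loss, grad


def guided_step(image, pattern, thresholds, patch, scale=1.0):
    """One gradient step on the image; returns the new image and the prior loss."""
    loss, grad = guidance_grad(image, pattern, thresholds, patch)
    return np.clip(image - scale * grad, 0.0, 1.0), loss
```

Because the gradient vanishes on patches that already satisfy their bit, the update only perturbs regions that still disagree with the target pattern, which is one plausible reason such guidance could leave image quality largely intact.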

🛡️ Threat Analysis

Output Integrity Attack

Embeds watermarks in the model's output images (not in the model weights) to prove content provenance and enable detection of AI-generated imagery. This is a classic case of output integrity protection through content watermarking.

Details

Domains

vision, generative

Model Types

diffusion, gan

Threat Tags

inference_time

Datasets

Stable Diffusion 2.1 (and 8 additional diffusion/AR/hybrid models)

Applications

ai-generated image provenance, content watermarking, image generation

Similar Papers

SAiW: Source-Attributable Invisible Watermarking for Proactive Deepfake Defense

Moiré Video Authentication: A Physical Signature Against AI Video Generation

Beyond Artifacts: Real-Centric Envelope Modeling for Reliable AI-Generated Image Detection

Causal Fingerprints of AI Generative Models

Multi-Feature Fusion Approach for Generative AI Images Detection

CIPHER: Counterfeit Image Pattern High-level Examination via Representation

R²BD: A Reconstruction-Based Method for Generalizable and Efficient Detection of Fake Images

ZK-WAGON: Imperceptible Watermark for Image Generation Models using ZK-SNARKs