defense 2025

T2SMark: Balancing Robustness and Diversity in Noise-as-Watermark for Diffusion Models

Jindong Yang ¹,², Han Fang ³, Weiming Zhang ¹,², Nenghai Yu ¹,², Kejiang Chen ¹,²

¹ University of Science and Technology of China

² Anhui Province Key Laboratory of Digital Security

³ National University of Singapore

5 citations · 2 influential · arXiv

Published on arXiv

2510.22366

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

T2SMark achieves an optimal balance between watermark robustness and generation diversity on diffusion models with both U-Net and DiT backbones, outperforming Gaussian Shading and PRC-Watermark on their respective weaknesses.

T2SMark / Tail-Truncated Sampling (TTS)

Novel technique introduced

Diffusion models have advanced rapidly in recent years, producing high-fidelity images while raising concerns about intellectual property protection and the misuse of generative AI. Image watermarking methods for diffusion models, particularly Noise-as-Watermark (NaW) methods, encode the watermark as a specific standard Gaussian noise vector used for image generation, embedding the information seamlessly while maintaining image quality. For detection, the generation process is inverted to recover the initial noise vector containing the watermark, from which the watermark is then extracted. However, existing NaW methods struggle to balance watermark robustness with generation diversity. Some methods achieve strong robustness by heavily constraining initial noise sampling, which degrades user experience, while others preserve diversity but prove too fragile for real-world deployment. To address this issue, we propose T2SMark, a two-stage watermarking scheme based on Tail-Truncated Sampling (TTS). Unlike prior methods that simply map bits to positive or negative values, TTS enhances robustness by embedding bits exclusively in the reliable tail regions while randomly sampling the central zone to preserve the latent distribution. Our two-stage framework then ensures sampling diversity by integrating a randomly generated session key into both encryption pipelines. We evaluate T2SMark on diffusion models with both U-Net and DiT backbones. Extensive experiments show that it achieves an optimal balance between robustness and diversity. Our code is available at https://github.com/0xD009/T2SMark.
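The tail/central split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the threshold `tau`, the seeded choice of carrier positions, the mapping of bit 1 to the right tail and bit 0 to the left tail, and the sign-based extraction are all assumptions made for illustration. The sketch also does not enforce that the share of carrier positions matches the Gaussian tail mass, which a real scheme would likely need in order to keep the overall noise close to standard Gaussian.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)


def sample_truncated(lo_p, hi_p, rng):
    """Draw from N(0, 1) restricted to the quantile range [lo_p, hi_p] via the inverse CDF."""
    u = rng.uniform(lo_p, hi_p)
    # inv_cdf requires 0 < p < 1, so clamp away from the endpoints.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return STD_NORMAL.inv_cdf(u)


def tts_embed(bits, n, tau, seed):
    """Hypothetical TTS-style embedding: bits go to the tails, everything else to the centre."""
    rng = random.Random(seed)
    # Assumption: carrier positions are picked pseudo-randomly from a seed.
    carriers = sorted(rng.sample(range(n), len(bits)))
    p_tau = STD_NORMAL.cdf(tau)  # probability mass below +tau
    # Start with every position sampled from the central zone |z| <= tau.
    noise = [sample_truncated(1.0 - p_tau, p_tau, rng) for _ in range(n)]
    for pos, bit in zip(carriers, bits):
        if bit:
            noise[pos] = sample_truncated(p_tau, 1.0, rng)        # right tail, z >= tau
        else:
            noise[pos] = sample_truncated(0.0, 1.0 - p_tau, rng)  # left tail, z <= -tau
    return noise, carriers


def tts_extract(noise, carriers):
    """Read each bit back from the sign of the noise value at its carrier position."""
    return [1 if noise[pos] > 0 else 0 for pos in carriers]
```

Because every carrier value lies at least `tau` away from zero, a recovered noise vector has to be perturbed by more than `tau` at a carrier position before that bit flips, which is the intuition behind calling the tails "reliable".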

Key Contributions

Tail-Truncated Sampling (TTS): embeds watermark bits only in reliable tail regions of the Gaussian distribution while randomly sampling the central zone, improving robustness over naive positive/negative mappings
Two-stage hierarchical key encryption framework that integrates a randomly generated session key to ensure generation diversity alongside strong watermark robustness
Multidimensional projection of reconstructed Gaussian noise at detection time to fully exploit continuous information for improved decoding accuracy
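To make the role of the session key concrete, here is a toy sketch of a two-level key hierarchy. The summary does not say which cipher T2SMark uses or how its two stages are wired, so everything here is an assumption: the names `keystream_xor`, `two_stage_encrypt`, and `two_stage_decrypt` are hypothetical, and the SHAKE-256 keystream XOR stands in for whatever encryption the authors actually apply. It is not secure as written (the long-term keystream is reused across sessions) and only shows why a fresh session key makes the masked payload differ on every generation.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with a SHAKE-256 keystream derived from key."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def two_stage_encrypt(master_key: bytes, payload: bytes):
    """Assumed hierarchy: the master key hides the session key, the session key hides the payload."""
    session_key = secrets.token_bytes(16)  # fresh randomness for each generation
    wrapped_key = keystream_xor(master_key, session_key)
    masked_payload = keystream_xor(session_key, payload)
    return wrapped_key, masked_payload


def two_stage_decrypt(master_key: bytes, wrapped_key: bytes, masked_payload: bytes) -> bytes:
    """Recover the session key first, then use it to unmask the payload."""
    session_key = keystream_xor(master_key, wrapped_key)
    return keystream_xor(session_key, masked_payload)
```

In this sketch, encrypting the same payload twice under the same master key yields different bit patterns, so the bits that would drive the noise sampling change from one generation to the next even when the watermark message does not.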

🛡️ Threat Analysis

Output Integrity Attack

T2SMark embeds watermarks in the generated image outputs of diffusion models (via the initial noise vector) to enable provenance tracking and intellectual property protection of AI-generated content — this is content watermarking of model outputs, not model weight watermarking.

Details

Domains

vision, generative

Model Types

diffusion

Threat Tags

inference_time

Datasets

LDM, Stable Diffusion 3

Applications

ai-generated image provenance, diffusion model copyright protection, content authentication


Similar Papers

ShapeMark: Robust and Diversity-Preserving Watermarking for Diffusion Models

Neighbor-Aware Localized Concept Erasure in Text-to-Image Diffusion Models

SIGMark: Scalable In-Generation Watermark with Blind Extraction for Video Diffusion

FIND: A Simple yet Effective Baseline for Diffusion-Generated Image Detection

ALIEN: Analytic Latent Watermarking for Controllable Generation

A Difference-in-Difference Approach to Detecting AI-Generated Images

I2VWM: Robust Watermarking for Image to Video Generation

Physics-Driven Spatiotemporal Modeling for AI-Generated Video Detection