Watermarking Diffusion Language Models

We introduce the first watermark tailored for diffusion language models (DLMs), an emerging LLM paradigm that can generate tokens in arbitrary order, in contrast to standard autoregressive language models (ARLMs), which generate tokens sequentially. Although ARLM watermarking has been studied extensively, a key challenge in applying these schemes directly to DLMs is that they rely on previously generated tokens, which are not always available during DLM generation. We address this challenge by (i) applying the watermark in expectation over the context, even when some context tokens are yet to be determined, and (ii) promoting tokens that increase the watermark strength when used as context for other tokens. Both are accomplished while keeping the watermark detector unchanged. Our experimental evaluation shows that the DLM watermark achieves a >99% true positive rate with minimal quality impact and robustness similar to existing ARLM watermarks, enabling reliable DLM watermarking for the first time.
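Idea (i) can be illustrated with a toy sketch. It assumes a red-green scheme in which a keyed hash of the preceding token decides which candidate tokens count as green. The abstract does not name the underlying scheme, so this choice, along with `is_green`, `GAMMA`, `DELTA`, and the 8-token vocabulary, is a hypothetical stand-in rather than the paper's construction. When the context token is still masked, the boost for each candidate is weighted by the model's current distribution over that context token.

```python
import hashlib
import math

VOCAB_SIZE = 8   # toy vocabulary size (assumption)
GAMMA = 0.5      # target fraction of green tokens (assumption)
DELTA = 2.0      # logit boost for a fully green token (assumption)


def is_green(context_token: int, token: int) -> bool:
    """Hypothetical keyed hash: is `token` green given `context_token`?"""
    digest = hashlib.sha256(f"key:{context_token}:{token}".encode()).digest()
    return digest[0] / 256.0 < GAMMA


def expected_green(context_probs: list[float]) -> list[float]:
    """Probability that each candidate token is green, averaged over the
    model's distribution for a context token that may still be masked."""
    return [
        sum(p * is_green(c, v) for c, p in enumerate(context_probs))
        for v in range(VOCAB_SIZE)
    ]


def watermark_logits(logits: list[float], context_probs: list[float]) -> list[float]:
    """Boost each candidate's logit in proportion to its expected greenness."""
    greenness = expected_green(context_probs)
    return [logit + DELTA * g for logit, g in zip(logits, greenness)]


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

With a one-hot context distribution, the expected greenness reduces to the usual 0/1 green indicator, so the sketch coincides with a conventional ARLM-style boost once the context token is known.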

Key Contributions

First watermarking scheme designed for diffusion language models (DLMs), which generate tokens in arbitrary order rather than sequentially
Applies the watermark signal in expectation over the context, even when some context tokens are undetermined at generation time
Promotes tokens that strengthen the watermark signal when used as context for other tokens, without modifying the watermark detector
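The detector that stays unchanged can be pictured as a standard green-token count test. The following is a minimal sketch under an assumed red-green scheme keyed on the preceding token (`is_green` and `GAMMA` are hypothetical), not the detector specified by the paper. It scores a finished sequence only, so it needs no knowledge of the order in which tokens were generated.

```python
import hashlib
import math

GAMMA = 0.5  # expected green fraction in unwatermarked text (assumption)


def is_green(context_token: int, token: int) -> bool:
    """Hypothetical keyed hash: is `token` green given `context_token`?"""
    digest = hashlib.sha256(f"key:{context_token}:{token}".encode()).digest()
    return digest[0] / 256.0 < GAMMA


def detect_z_score(tokens: list[int]) -> float:
    """One-sided z-score for the green-token count of a finished sequence.

    Each token is scored against its predecessor, so the first token is
    context only. Large positive values suggest a watermark.
    """
    scored = len(tokens) - 1
    if scored < 1:
        raise ValueError("need at least two tokens to score")
    green = sum(is_green(prev, cur) for prev, cur in zip(tokens, tokens[1:]))
    return (green - GAMMA * scored) / math.sqrt(scored * GAMMA * (1 - GAMMA))
```

A caller would compare the returned score with a threshold chosen for the desired false positive rate; the threshold itself is a deployment decision and is not fixed by this sketch.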

🛡️ Threat Analysis

Output Integrity Attack

Embeds detectable watermarks in the text outputs of diffusion language models (not in model weights) to trace the provenance of AI-generated content; a canonical case of ML09 content watermarking for LLM outputs.

Details

Domains

nlp, generative

Model Types

llm, transformer

Threat Tags

inference_time

Applications

2026, 0 citations

Output Integrity Attack: 82%

Similar Papers

dgMARK: Decoding-Guided Watermarking for Diffusion Language Models

Watermarking Discrete Diffusion Language Models

LR-DWM: Efficient Watermarking for Diffusion Language Models

Improve the Trade-off Between Watermark Strength and Speculative Sampling Efficiency for Language Models

DEER: Disentangled Mixture of Experts with Instance-Adaptive Routing for Generalizable Machine-Generated Text Detection

DetectAnyLLM: Towards Generalizable and Robust Detection of Machine-Generated Text Across Domains and Models

DP-MGTD: Privacy-Preserving Machine-Generated Text Detection via Adaptive Differentially Private Entity Sanitization

MirrorMark: A Distortion-Free Multi-Bit Watermark for Large Language Models