defense 2026

Online LLM watermark detection via e-processes

Weijie Su¹, Ruodu Wang², Zinan Zhao³

0 citations · 32 references · arXiv (Cornell University)


Published on arXiv · 2602.14286

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Proposed e-process framework achieves competitive or superior watermark detection power compared to existing fixed-sample methods while additionally providing anytime-valid sequential guarantees.

e-process watermark detection

Novel technique introduced


Watermarking for large language models (LLMs) has emerged as an effective tool for distinguishing AI-generated text from human-written content. Statistically, watermark schemes induce dependence between generated tokens and a pseudo-random sequence, reducing watermark detection to a hypothesis test of independence. We develop a unified framework for LLM watermark detection based on e-processes, providing anytime-valid guarantees for online testing. We propose several methods for constructing empirically adaptive e-processes that enhance detection power, and we establish theoretical results characterizing the power properties of the proposed procedures. Experiments demonstrate that the proposed framework achieves competitive performance compared to existing watermark detection methods.


Key Contributions

  • Unified e-process framework for LLM watermark detection with anytime-valid Type I error control under arbitrary stopping times, enabling sequential/streaming detection
  • Empirically adaptive e-process construction methods (adaptive weights, online Grenander algorithm for calibrators) that enhance detection power, sometimes outperforming non-sequential baselines
  • Asymptotic power-one theoretical results characterizing the power properties of the proposed procedures under the Gumbel-max watermark scheme
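To make the e-process mechanics concrete, here is a minimal sketch, not the paper's implementation: each token yields a p-value that is uniform under the null (human text) and stochastically small under a Gumbel-max-style watermark; per-token e-values from the standard calibrator e(p) = κ·p^(κ−1) are multiplied, and Ville's inequality makes the threshold 1/α anytime-valid. The Beta-like simulation of watermarked scores and the choice κ = 0.5 are illustrative assumptions.

```python
import random

def calibrator(p, kappa=0.5):
    """P-to-e calibrator e(p) = kappa * p**(kappa - 1).
    For uniform P, E[e(P)] = 1, so the running product is a test supermartingale."""
    return kappa * p ** (kappa - 1)

def e_process_detect(pvalues, alpha=0.01):
    """Multiply per-token e-values; by Ville's inequality, rejecting when the
    product reaches 1/alpha controls Type I error at alpha under ANY stopping rule."""
    e = 1.0
    for t, p in enumerate(pvalues, 1):
        e *= calibrator(p)
        if e >= 1.0 / alpha:
            return t  # detection time; validity holds even if we stop here
    return None

# Illustrative simulation: under a Gumbel-max watermark the pseudo-random value
# u of the sampled token is stochastically large, so p = 1 - u is small.
random.seed(0)
watermarked = [1.0 - max(random.random() for _ in range(4)) for _ in range(200)]
human = [random.random() for _ in range(200)]

print(e_process_detect(watermarked))  # typically detects within tens of tokens
print(e_process_detect(human))        # typically None: no false alarm
```

Because the guarantee is anytime-valid, the monitor can check the running product after every streamed token and stop the moment it crosses 1/α, which is exactly the sequential/streaming setting the framework targets.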

🛡️ Threat Analysis

Output Integrity Attack

Directly addresses detection of watermarks embedded in LLM-generated text outputs to distinguish AI-generated from human-written content — a content provenance and output integrity problem. The paper's sole contribution is the watermark detection framework, not model ownership or adversarial input manipulation.


Details

Domains
nlp
Model Types
llm
Threat Tags
inference_time
Applications
ai-generated text detection, llm watermark verification, streaming text authentication