
Improve the Trade-off Between Watermark Strength and Speculative Sampling Efficiency for Language Models

Weiqing He, Xiang Li, Li Shen, Weijie Su, Qi Long

0 citations · 55 references · arXiv (Cornell University)


Published on arXiv

arXiv:2602.01428

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

The proposed mechanism achieves maximal watermark strength while maintaining speculative sampling efficiency, improving detectability without sacrificing throughput — showing the strength-efficiency trade-off is not absolute.

Pseudorandom Draft-Token Acceptance

Novel technique introduced


Watermarking is a principled approach for tracing the provenance of large language model (LLM) outputs, but its deployment in practice is hindered by inference inefficiency. Speculative sampling accelerates inference, with efficiency improving as the acceptance rate between draft and target models increases. Yet recent work reveals a fundamental trade-off: higher watermark strength reduces acceptance, preventing their simultaneous achievement. We revisit this trade-off and show it is not absolute. We introduce a quantitative measure of watermark strength that governs statistical detectability and is maximized when tokens are deterministic functions of pseudorandom numbers. Using this measure, we fully characterize the trade-off as a constrained optimization problem and derive explicit Pareto curves for two existing watermarking schemes. Finally, we introduce a principled mechanism that injects pseudorandomness into draft-token acceptance, ensuring maximal watermark strength while maintaining speculative sampling efficiency. Experiments further show that this approach improves detectability without sacrificing efficiency. Our findings uncover a principle that unites speculative sampling and watermarking, paving the way for their efficient and practical deployment.
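The abstract's claim that watermark strength is maximized when "tokens are deterministic functions of pseudorandom numbers" can be illustrated with the Gumbel-max style watermark, a well-known scheme of this kind (the paper analyzes existing schemes, but the code below is an independent sketch, not the authors' implementation; all function names and the key are illustrative):

```python
# Hedged sketch of a Gumbel-max style watermark: the next token is a
# deterministic function of pseudorandom numbers seeded by the context,
# yet its marginal distribution still matches the model's probabilities.
import hashlib


def pseudorandom_uniforms(context_tokens, vocab_size, key=b"watermark-key"):
    """Derive one uniform in (0, 1) per vocabulary entry by hashing the
    secret key, the preceding context, and the candidate token id."""
    prefix = key + b"".join(t.to_bytes(4, "big") for t in context_tokens)
    us = []
    for v in range(vocab_size):
        h = hashlib.sha256(prefix + v.to_bytes(4, "big")).digest()
        us.append((int.from_bytes(h[:8], "big") + 1) / (2**64 + 2))
    return us


def watermarked_next_token(probs, context_tokens):
    """Gumbel-max trick: argmax_v u_v^(1/p_v) is distributed as Categorical(p),
    but the choice is fully determined by the pseudorandom u's, which is what
    makes the watermark statistically detectable."""
    us = pseudorandom_uniforms(context_tokens, len(probs))
    scores = [u ** (1.0 / p) if p > 0 else 0.0 for u, p in zip(us, probs)]
    return max(range(len(probs)), key=lambda v: scores[v])
```

Because the token is a deterministic function of the context and the key, a detector holding the key can recompute the same uniforms and test whether the observed tokens align with them.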


Key Contributions

  • Introduces a quantitative measure of watermark strength governing statistical detectability, maximized when tokens are deterministic functions of pseudorandom numbers
  • Fully characterizes the watermark strength vs. speculative sampling efficiency trade-off as a constrained optimization problem with explicit Pareto curves for two existing schemes
  • Proposes a principled mechanism injecting pseudorandomness into draft-token acceptance, provably achieving maximal watermark strength while maintaining speculative sampling efficiency
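The third contribution's idea of injecting pseudorandomness into draft-token acceptance can be sketched as follows: standard speculative sampling accepts a draft token with probability min(1, p_target/p_draft) using a fresh random uniform; replacing that fresh uniform with one derived pseudorandomly from the context makes the acceptance decision reproducible, so it can carry watermark signal. This is a minimal illustration of the stated idea, not the paper's actual mechanism; the hash construction and names are assumptions:

```python
# Hedged sketch: speculative-sampling acceptance driven by a pseudorandom
# uniform seeded on the context (illustrative key and hashing scheme).
import hashlib


def pseudorandom_uniform(context_tokens, key=b"watermark-key"):
    """Map (key, context) deterministically to a uniform in (0, 1)."""
    data = key + b"".join(t.to_bytes(4, "big") for t in context_tokens)
    h = hashlib.sha256(data).digest()
    return (int.from_bytes(h[:8], "big") + 1) / (2**64 + 2)


def accept_draft_token(draft_token, p_target, p_draft, context_tokens):
    """Standard acceptance test min(1, p_target/p_draft), but the uniform is
    pseudorandom, so a detector holding the key can replay the decision."""
    u = pseudorandom_uniform(context_tokens)
    ratio = min(1.0, p_target[draft_token] / p_draft[draft_token])
    return u <= ratio
```

When draft and target distributions agree, the ratio is 1 and every draft token is accepted, so the pseudorandom substitution costs no speculative-sampling efficiency in that regime.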

🛡️ Threat Analysis

Output Integrity Attack

Proposes a watermarking mechanism embedded in LLM text outputs to trace provenance and verify content authenticity — output integrity / content watermarking. The paper explicitly targets statistical detectability of AI-generated text.
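The "statistical detectability" targeted here is typically tested by recomputing the pseudorandom numbers from the secret key and scoring how well the observed tokens align with them. The sketch below uses the standard score for Gumbel-style watermarks, where each term is Exp(1) under non-watermarked text, so a large total is evidence of watermarking; it is a generic illustration under that assumption, not the paper's detector:

```python
# Hedged sketch of a keyed watermark detector: recompute the pseudorandom
# uniform tied to each emitted token and accumulate -log(1 - u).
import hashlib
import math


def detect_score(tokens, key=b"watermark-key"):
    """Sum of -log(1 - u_t) over positions. For non-watermarked text each
    term is approximately Exp(1); watermarked text yields a markedly
    larger sum, enabling a p-value test against the Gamma(n, 1) null."""
    score = 0.0
    for t in range(1, len(tokens)):
        prefix = key + b"".join(s.to_bytes(4, "big") for s in tokens[:t])
        h = hashlib.sha256(prefix + tokens[t].to_bytes(4, "big")).digest()
        u = (int.from_bytes(h[:8], "big") + 1) / (2**64 + 2)
        score += -math.log(1.0 - u)
    return score
```

Because the score depends only on the tokens and the key, detection needs no access to the model, which is what makes this attack class an output-integrity concern.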


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
inference_time
Applications
llm text provenance, ai-generated text detection, llm inference acceleration