Improve the Trade-off Between Watermark Strength and Speculative Sampling Efficiency for Language Models
Weiqing He, Xiang Li, Li Shen, Weijie Su, Qi Long
Published on arXiv
2602.01428
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
The proposed mechanism achieves maximal watermark strength while preserving speculative sampling efficiency, improving detectability without sacrificing throughput and showing that the strength-efficiency trade-off is not absolute.
Pseudorandom Draft-Token Acceptance
Novel technique introduced
Watermarking is a principled approach for tracing the provenance of large language model (LLM) outputs, but its deployment in practice is hindered by inference inefficiency. Speculative sampling accelerates inference, with efficiency improving as the acceptance rate between draft and target models increases. Yet recent work reveals a fundamental trade-off: higher watermark strength reduces acceptance, preventing their simultaneous achievement. We revisit this trade-off and show it is not absolute. We introduce a quantitative measure of watermark strength that governs statistical detectability and is maximized when tokens are deterministic functions of pseudorandom numbers. Using this measure, we fully characterize the trade-off as a constrained optimization problem and derive explicit Pareto curves for two existing watermarking schemes. Finally, we introduce a principled mechanism that injects pseudorandomness into draft-token acceptance, ensuring maximal watermark strength while maintaining speculative sampling efficiency. Experiments further show that this approach improves detectability without sacrificing efficiency. Our findings uncover a principle that unites speculative sampling and watermarking, paving the way for their efficient and practical deployment.
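The abstract notes that watermark strength is maximized when tokens are deterministic functions of pseudorandom numbers. A minimal sketch of that regime is the Gumbel-max (exponential minimum) watermark: uniforms are derived from a secret key and the token context via a PRF, and the next token is chosen deterministically from them. The function names, the SHA-256-based PRF, and the key string below are illustrative assumptions, not the paper's implementation.

```python
import hashlib

import numpy as np


def prf_uniforms(key: str, context: tuple, vocab_size: int) -> np.ndarray:
    """Derive vocab_size pseudorandom uniforms from a secret key and the
    preceding token context (a keyed PRF, sketched here with SHA-256)."""
    seed = int.from_bytes(
        hashlib.sha256(f"{key}|{context}".encode()).digest()[:8], "big"
    )
    rng = np.random.default_rng(seed)
    return rng.uniform(size=vocab_size)


def gumbel_watermark_sample(probs: np.ndarray, uniforms: np.ndarray) -> int:
    """Pick the token via the Gumbel-max trick: argmax_i u_i^(1/p_i).
    Marginally this reproduces sampling from probs, yet the choice is a
    deterministic function of the pseudorandom numbers -- the regime in
    which watermark strength is maximal."""
    return int(np.argmax(uniforms ** (1.0 / np.maximum(probs, 1e-12))))
```

Because the uniforms depend only on the key and context, a detector holding the key can recompute them and test whether the observed tokens align with them.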
Key Contributions
- Introduces a quantitative measure of watermark strength governing statistical detectability, maximized when tokens are deterministic functions of pseudorandom numbers
- Fully characterizes the watermark strength vs. speculative sampling efficiency trade-off as a constrained optimization problem with explicit Pareto curves for two existing schemes
- Proposes a principled mechanism injecting pseudorandomness into draft-token acceptance, provably achieving maximal watermark strength while maintaining speculative sampling efficiency
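The third contribution injects pseudorandomness into draft-token acceptance. One way to picture this is the standard speculative-sampling acceptance test, min(1, p_target/p_draft), driven by a keyed pseudorandom uniform rather than fresh randomness, so the accept/reject outcome itself becomes watermark-bearing. This is a hedged sketch under assumed names (keyed_uniform, speculative_accept); the paper's actual mechanism may differ in detail.

```python
import hashlib


def keyed_uniform(key: str, context: tuple) -> float:
    """One pseudorandom uniform in [0, 1) derived from a secret key and
    the token context (SHA-256 as a stand-in PRF)."""
    digest = hashlib.sha256(f"{key}|{context}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def speculative_accept(p_target: float, p_draft: float, u: float) -> bool:
    """Standard speculative-sampling acceptance test, but driven by the
    pseudorandom uniform u: accept the draft token iff
    u < min(1, p_target / p_draft). Acceptance probability is unchanged,
    so efficiency is preserved, while the outcome is reproducible by a
    detector that knows the key."""
    return u < min(1.0, p_target / max(p_draft, 1e-12))
```

When the target model assigns at least as much probability as the draft model, the ratio is clipped at 1 and the token is always accepted, exactly as in ordinary speculative sampling.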
🛡️ Threat Analysis
Proposes a watermarking mechanism that embeds a statistical signal in LLM text outputs to trace provenance and verify content authenticity (output integrity / content watermarking). The paper explicitly targets statistical detectability of AI-generated text.
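Statistical detectability of this kind is typically tested by recomputing the keyed uniforms and aggregating per-token evidence. A common sketch (an assumption here, not the paper's exact statistic) sums -log(1 - u_t) over the chosen tokens: under the null each term is Exp(1), so the normalized sum is approximately standard normal and large values indicate a watermark.

```python
import math


def detection_score(uniforms_at_chosen_tokens: list) -> float:
    """Aggregate per-token evidence: under a Gumbel-style scheme the
    uniform at the chosen token is stochastically larger for watermarked
    text, so the sum of -log(1 - u_t) acts as a test statistic."""
    return sum(-math.log(1.0 - u) for u in uniforms_at_chosen_tokens)


def z_score(score: float, n: int) -> float:
    """Normalize: under the null (no watermark) each term is Exp(1) with
    mean 1 and variance 1, so (S - n) / sqrt(n) is approximately
    standard normal for large n."""
    return (score - n) / math.sqrt(n)
```

A detector would flag text whose z-score exceeds a threshold chosen for the desired false-positive rate.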