defense 2025

WISER: Segmenting watermarked region - an epidemic change-point perspective

Soham Bonnerjee 1, Sayar Karmakar 2, Subhrajyoty Roy 3

0 citations · 61 references · arXiv

Published on arXiv

2509.21160

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

WISER outperforms state-of-the-art watermark segmentation baselines in both computational speed and accuracy while providing provable consistency guarantees for multiple watermarked segment detection.

WISER

Novel technique introduced


With the increasing popularity of large language models, concerns over content authenticity have led to the development of myriad watermarking schemes. These schemes can be used to detect machine-generated text via an appropriate key, while remaining imperceptible to readers without such a key. The corresponding detection mechanisms usually take the form of statistical hypothesis tests for the existence of watermarks, spurring extensive research in this direction. However, the finer-grained problem of identifying which segments of a mixed-source text are actually watermarked is much less explored; existing approaches lack either scalability or theoretical guarantees that are robust to paraphrasing and post-editing. In this work, we introduce a unique perspective on such watermark segmentation problems through the lens of epidemic change-points. By highlighting the similarities as well as the differences between these two problems, we motivate and propose WISER: a novel, computationally efficient watermark segmentation algorithm. We theoretically validate our algorithm by deriving finite-sample error bounds and establishing its consistency in detecting multiple watermarked segments in a single text. Complementing these theoretical results, our extensive numerical experiments show that WISER outperforms state-of-the-art baseline methods in both computational speed and accuracy on various benchmark datasets embedded with diverse watermarking schemes. Our theoretical and empirical findings establish WISER as an effective tool for watermark localization in most settings. They also show how insights from a classical statistical problem can lead to a theoretically valid and computationally efficient solution to a modern and pertinent problem.


Key Contributions

  • Reframes watermark segmentation as an epidemic change-point detection problem, enabling principled algorithmic design and theoretical analysis
  • Proposes WISER, a computationally efficient algorithm for localizing multiple watermarked segments in a single mixed-source text
  • Derives finite-sample error bounds and consistency guarantees for WISER, the first provably consistent multi-segment watermark localization method
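
To make the epidemic change-point framing concrete, here is a minimal sketch of a generic single-segment scan. It is not the WISER algorithm, whose details are not given on this page. It assumes each token comes with a detection score whose mean under unwatermarked text (`null_mean`) is known, and that watermarked tokens score higher on average. The function name `epidemic_scan` and its parameters are hypothetical, and the brute-force search is quadratic in the text length, so it illustrates the problem rather than an efficient solution.

```python
from math import sqrt


def epidemic_scan(scores, null_mean, min_len=1):
    """Find the half-open interval [start, end) whose scores exceed
    null_mean the most, measured by the standardized excess
    sum(score - null_mean) / sqrt(length).

    Hypothetical illustration only: assumes a known null mean and at
    most one elevated segment. Returns (start, end, stat); if the input
    is shorter than min_len, returns (0, 0, -inf).
    """
    n = len(scores)
    # Prefix sums of centered scores give any interval sum in O(1).
    prefix = [0.0]
    for x in scores:
        prefix.append(prefix[-1] + (x - null_mean))

    best = (0, 0, float("-inf"))
    for start in range(n):
        for end in range(start + min_len, n + 1):
            stat = (prefix[end] - prefix[start]) / sqrt(end - start)
            if stat > best[2]:
                best = (start, end, stat)
    return best


# Toy input: 20 unwatermarked tokens, 10 elevated tokens, 20 unwatermarked.
toy_scores = [0.5] * 20 + [0.9] * 10 + [0.5] * 20
start, end, stat = epidemic_scan(toy_scores, null_mean=0.5)
print(start, end)  # prints: 20 30
```

On this noise-free toy input the scan recovers the elevated block exactly. With real, noisy scores the maximizing interval would only approximate the true segment, and handling several segments would need an additional step that this sketch omits.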

🛡️ Threat Analysis

Output Integrity Attack

WISER detects and localizes LLM-generated watermarked text segments within mixed-source documents — directly addressing AI-generated content provenance and output authenticity verification, the core of ML09.


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
inference_time
Datasets
Various benchmark datasets with diverse watermarking schemes
Applications
llm text watermarking, content authenticity verification, ai-generated text detection