defense 2025

RegionMarker: A Region-Triggered Semantic Watermarking Framework for Embedding-as-a-Service Copyright Protection

Shufan Yang , Zifeng Cheng , Zhiwei Jiang , Yafeng Yin , Cong Wang , Shiping Ge , Yuchen Fu , Qing Gu

0 citations · 34 references · arXiv

Published on arXiv

2511.13329

Model Theft

OWASP ML Top 10 — ML05

Model Theft

OWASP LLM Top 10 — LLM10

Key Finding

RegionMarker is the only evaluated method that simultaneously resists all three major EaaS attack types (CSE, paraphrasing, and dimension-perturbation); every prior method evaluated (EmbMarker, WARDEN, WET, EspeW) fails against at least one of them.

RegionMarker

Novel technique introduced


Embedding-as-a-Service (EaaS) is an effective and convenient deployment solution for addressing various NLP tasks. Nevertheless, recent research has shown that EaaS is vulnerable to model extraction attacks, which could lead to significant economic losses for model providers. For copyright protection, existing methods inject watermark embeddings into text embeddings and use them to detect copyright infringement. However, current watermarking methods often resist only a subset of attacks and fail to provide comprehensive protection. To this end, we present the region-triggered semantic watermarking framework called RegionMarker, which defines trigger regions within a low-dimensional space and injects watermarks into text embeddings associated with these regions. By utilizing a secret dimensionality reduction matrix to project onto this subspace and randomly selecting trigger regions, RegionMarker makes it difficult for watermark removal attacks to evade detection. Furthermore, by embedding watermarks across the entire trigger region and using the text embedding as the watermark, RegionMarker is resilient to both paraphrasing and dimension-perturbation attacks. Extensive experiments on various datasets show that RegionMarker is effective in resisting different attack methods, thereby protecting the copyright of EaaS.
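
The region-triggered idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The dimensions, the Gaussian projection, the sign-pattern definition of a "region", the mixing rule, and the names `SECRET_PROJ`, `TRIGGER_REGIONS`, `TARGET`, `region_id` and `inject` are all assumptions made for illustration. In particular, a fixed stand-in vector `TARGET` is swapped in for the watermark, whereas the paper uses the text embedding as the watermark.

```python
import numpy as np

DIM, LOW_DIM = 64, 4  # hypothetical sizes, not taken from the paper
rng = np.random.default_rng(0)

# Secret projection into the low-dimensional space (assumed: random Gaussian).
SECRET_PROJ = rng.standard_normal((DIM, LOW_DIM))
# Assumption: a "region" is one of the 2**LOW_DIM sign patterns of the
# projected vector; a random subset of them is kept secret as triggers.
TRIGGER_REGIONS = set(rng.choice(2 ** LOW_DIM, size=4, replace=False).tolist())
# Stand-in watermark direction (the paper derives it from the text embedding).
TARGET = rng.standard_normal(DIM)
TARGET /= np.linalg.norm(TARGET)


def region_id(embedding, proj=SECRET_PROJ):
    """Map an embedding to the index of its region in the secret subspace."""
    bits = ((embedding @ proj) > 0).astype(int)
    return int(bits @ (1 << np.arange(bits.size)))


def inject(embedding, trigger_regions=TRIGGER_REGIONS, strength=0.5):
    """Return the embedding unchanged unless it falls in a trigger region."""
    if region_id(embedding) not in trigger_regions:
        return embedding
    mixed = (1 - strength) * embedding + strength * TARGET
    return mixed / np.linalg.norm(mixed)  # keep the output unit-length


# Usage: force one embedding into a trigger region, then into none.
emb = rng.standard_normal(DIM)
emb /= np.linalg.norm(emb)
marked = inject(emb, trigger_regions={region_id(emb)})
clean = inject(emb, trigger_regions=set())
```

Because the trigger decision depends on where the whole embedding lands after the secret projection, rather than on specific words, a paraphrase whose embedding stays in the same region would still be watermarked in this sketch.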


Key Contributions

  • Region-triggered semantic watermarking framework (RegionMarker) that defines trigger regions in a low-dimensional semantic subspace rather than relying on trigger words, and embeds watermarks across the entire trigger region, enabling robustness against paraphrasing attacks
  • Secret dimensionality reduction matrix for projecting into the trigger subspace, combined with randomly selected trigger regions, which makes it difficult for watermark removal attacks to evade detection; using the text embedding itself as the watermark adds resilience to dimension-perturbation attacks (permutation, truncation, shift)
  • Comprehensive protection: presented as the first method to simultaneously resist CSE attacks, paraphrasing attacks, and dimension-perturbation attacks against EaaS watermarking

🛡️ Threat Analysis

Model Theft

RegionMarker's primary purpose is protecting EaaS model intellectual property from model extraction attacks — watermarks are injected into output embeddings so that a stolen (extracted) model retains detectable ownership evidence, enabling copyright infringement detection. This is model IP protection, not content provenance tracking.


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
black_box, inference_time
Applications
embedding-as-a-service, text embedding apis, nlp copyright protection