defense 2025

From Essence to Defense: Adaptive Semantic-aware Watermarking for Embedding-as-a-Service Copyright Protection

Hao Li¹,², Yubing Ren¹,², Yanan Cao¹,², Yingjie Li¹,², Fang Fang¹,², Xuebin Wang¹,²

¹ Chinese Academy of Sciences

² University of Chinese Academy of Sciences

0 citations · 45 references · arXiv

Published on arXiv

2512.16439

Model Theft

OWASP ML Top 10 — ML05

Model Theft

OWASP LLM Top 10 — LLM10

Key Finding

SemMark outperforms trigger-based and transformation-based baselines on verifiability, diversity, stealthiness, and harmlessness across four NLP benchmarks while resisting the proposed Detect-Sampling and Dimensionality-Reduction attacks.

SemMark

Novel technique introduced

Benefiting from the superior capabilities of large language models in natural language understanding and generation, Embeddings-as-a-Service (EaaS) has emerged as a successful commercial paradigm on the web. However, prior studies have revealed that EaaS is vulnerable to imitation attacks. Existing methods protect the intellectual property of EaaS through watermarking, but they all ignore the most important property of embeddings, their semantics, which limits harmlessness and stealthiness. To this end, we propose SemMark, a novel semantic-based watermarking paradigm for EaaS copyright protection. SemMark employs locality-sensitive hashing to partition the semantic space and injects semantic-aware watermarks into specific regions, ensuring that the watermark signals remain imperceptible and diverse. In addition, we introduce an adaptive watermark weight mechanism based on the local outlier factor to preserve the original embedding distribution. Furthermore, we propose the Detect-Sampling and Dimensionality-Reduction attacks and construct four scenarios to evaluate the watermarking method. Extensive experiments on four popular NLP datasets show that SemMark achieves superior verifiability, diversity, stealthiness, and harmlessness.
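To make the idea of partitioning the semantic space concrete, here is a minimal sketch. The abstract does not say which LSH family SemMark uses or how the watermark is blended in, so everything below is an assumption for illustration: random-hyperplane (sign) hashing as the LSH scheme, a linear blend toward a per-region target vector as the injection step, and the hypothetical names `lsh_region_ids`, `inject_watermark`, `hyperplanes`, `marked_regions`, `targets`, and `weight`.

```python
import numpy as np

def lsh_region_ids(embeddings, hyperplanes):
    """Return one integer region id per embedding from its hyperplane sign pattern."""
    bits = (embeddings @ hyperplanes.T) > 0                # (n, k) booleans
    powers = 2 ** np.arange(hyperplanes.shape[0], dtype=np.int64)
    return bits.astype(np.int64) @ powers                  # ids in [0, 2**k)

def inject_watermark(embeddings, hyperplanes, marked_regions, targets, weight=0.2):
    """Blend a region-specific target into embeddings that fall in marked regions.

    Assumption: a fixed linear blend followed by re-normalisation. This is an
    illustrative stand-in, not the paper's injection rule.
    """
    ids = lsh_region_ids(embeddings, hyperplanes)
    out = embeddings.astype(np.float64, copy=True)
    for region, target in zip(marked_regions, targets):
        mask = ids == region
        out[mask] = (1.0 - weight) * out[mask] + weight * target
    out /= np.linalg.norm(out, axis=1, keepdims=True)      # keep unit length
    return out, ids

# Toy data: 200 unit-norm "embeddings" in 16 dimensions, 3 hyperplanes (8 regions).
rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 16))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)
planes = rng.normal(size=(3, 16))
targets = rng.normal(size=(2, 16))
targets /= np.linalg.norm(targets, axis=1, keepdims=True)

marked, ids = inject_watermark(emb, planes, marked_regions=[0, 5], targets=targets)
```

Because nearby embeddings tend to share a sign pattern, only inputs from the chosen semantic regions are shifted, and different regions can carry different signals. Embeddings outside the marked regions pass through unchanged.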

Key Contributions

SemMark: locality-sensitive hashing partitions the semantic embedding space to inject diverse, semantics-aware watermarks rather than fixed trigger or global transformation signals
Adaptive watermark weight mechanism using Local Outlier Factor to preserve the original embedding distribution and improve stealthiness
Two novel watermark-removal attacks — Detect-Sampling and Dimensionality-Reduction — plus a four-scenario evaluation framework for EaaS watermarking robustness
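The Local Outlier Factor mentioned in the second contribution can be sketched as follows. The LOF computation follows the standard definition (k-distance, reachability distance, local reachability density). The mapping from LOF score to watermark weight is not given on this page, so `adaptive_weights` and its `base_weight` parameter are purely hypothetical: the sketch assumes that points which already look like outliers should receive a smaller weight.

```python
import numpy as np

def local_outlier_factor(points, k=5):
    """Standard LOF via brute-force pairwise distances (O(n^2) memory, toy sizes only)."""
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dist, np.inf)                     # a point is not its own neighbour
    neighbors = np.argsort(dist, axis=1)[:, :k]        # indices of the k nearest points
    rows = np.arange(len(points))[:, None]
    neighbor_dist = dist[rows, neighbors]              # (n, k), sorted ascending
    k_distance = neighbor_dist[:, -1]                  # distance to the k-th neighbour
    # reach-dist(p, o) = max(k-distance(o), d(p, o))
    reach = np.maximum(k_distance[neighbors], neighbor_dist)
    lrd = 1.0 / (reach.mean(axis=1) + 1e-12)           # local reachability density
    return lrd[neighbors].mean(axis=1) / lrd           # ~1 for inliers, >1 for outliers

def adaptive_weights(lof_scores, base_weight=0.2):
    """Hypothetical mapping: full weight for inliers, reduced weight for outliers."""
    return base_weight / np.maximum(lof_scores, 1.0)

# Toy data: a dense cluster plus one distant point.
rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(size=(50, 2)), [[20.0, 20.0]]])
lof = local_outlier_factor(pts, k=5)
weights = adaptive_weights(lof, base_weight=0.2)
```

In this toy setup the distant point gets the largest LOF score and therefore the smallest weight. Other mappings (for example, a clipped or monotone nonlinear function of the score) would fit the same description equally well.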

🛡️ Threat Analysis

Model Theft

SemMark watermarks the outputs (embeddings) of an EaaS model to prove ownership and detect imitation attacks — the primary threat is model theft via API querying, and the watermark's purpose is to verify model IP if a clone is discovered. This maps to ML05 (model theft defense via watermarking), not ML09 (content provenance), because the watermark protects the MODEL's intellectual property rather than tracking the provenance of individual content outputs.

Details

Domains

nlp

Model Types

llm, transformer

Threat Tags

black_box, inference_time

Datasets

SST-2, AGNews, MIND, Amazon

Applications

embeddings-as-a-service, text embeddings, nlp api services


Similar Papers

Practical Secure Inference Algorithm for Fine-tuned Large Language Model Based on Fully Homomorphic Encryption

RegionMarker: A Region-Triggered Semantic Watermarking Framework for Embedding-as-a-Service Copyright Protection

FNF: Functional Network Fingerprint for Large Language Models

Reading Between the Lines: Towards Reliable Black-box LLM Fingerprinting via Zeroth-order Gradient Estimation

Watermarks for Embeddings-as-a-Service Large Language Models

Towards Distillation-Resistant Large Language Models: An Information-Theoretic Perspective

SecureInfer: Heterogeneous TEE-GPU Architecture for Privacy-Critical Tensors for Large Language Model Deployment

EditMF: Drawing an Invisible Fingerprint for Your Large Language Models