Geometry-Aware Localized Watermarking for Copyright Protection in Embedding-as-a-Service
Zhimin Chen, Xiaojie Liang, Wenbo Xu, Yuxuan Liu, Wei Lu
Published on arXiv
2604.11344
Model Theft
OWASP ML Top 10 — ML05
Key Finding
Maintains robust copyright verification under paraphrasing, dimensional perturbation, and CSE attacks with improved verification stability and low false-positive risk compared to existing methods
GeoMark
Novel technique introduced
Embedding-as-a-Service (EaaS) has become an important semantic infrastructure for natural language and multimedia applications, but it is highly vulnerable to model stealing and copyright infringement. Existing EaaS watermarking methods face a fundamental robustness–utility–verifiability tension: trigger-based methods are fragile to paraphrasing, transformation-based methods are sensitive to dimensional perturbation, and region-based methods may incur false positives due to coincidental geometric affinity. To address this problem, we propose GeoMark, a geometry-aware localized watermarking framework for EaaS copyright protection. GeoMark uses a natural in-manifold embedding as a shared watermark target, constructs geometry-separated anchors with explicit target–anchor margins, and activates watermark injection only within adaptive local neighborhoods. This design decouples where watermarking is triggered from what ownership is attributed to, achieving localized triggering and centralized attribution. Experiments on four benchmark datasets show that GeoMark preserves downstream utility and geometric fidelity while maintaining robust copyright verification under paraphrasing, dimensional perturbation, and CSE (Clustering, Selection, Elimination) attacks, with improved verification stability and low false-positive risk.
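The split between where watermarking is triggered (near anchors) and what it points to (one shared target) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the cosine-distance margin test, the per-anchor similarity thresholds in `radii`, and the linear blending rule with `strength` are all assumptions chosen for illustration, since the abstract does not specify these details.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity; both inputs are assumed to be unit vectors."""
    return sum(x * y for x, y in zip(a, b))

def pick_anchors(candidates, target, margin):
    """Keep candidates whose cosine distance to the target is at least `margin`.

    Assumed stand-in for the 'explicit target-anchor margin' constraint.
    """
    return [c for c in candidates if 1.0 - cosine(c, target) >= margin]

def watermark(embedding, target, anchors, radii, strength=0.5):
    """Blend `embedding` toward `target` only inside an anchor neighborhood.

    `radii[i]` is a hypothetical per-anchor similarity threshold (< 1.0)
    standing in for an adaptive local neighborhood. Outside every
    neighborhood the embedding is returned unchanged.
    """
    weight = 0.0
    for anchor, tau in zip(anchors, radii):
        sim = cosine(embedding, anchor)
        if sim > tau:
            # Weight grows from 0 at the neighborhood edge to `strength` at the anchor.
            weight = max(weight, strength * (sim - tau) / (1.0 - tau))
    if weight == 0.0:
        return list(embedding)
    mixed = [(1.0 - weight) * e + weight * t for e, t in zip(embedding, target)]
    return normalize(mixed)

target = normalize([1.0, 0.0, 0.0])
candidates = [normalize([0.9, 0.1, 0.0]), [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
anchors = pick_anchors(candidates, target, margin=0.5)
radii = [0.8] * len(anchors)

near = normalize([0.1, 1.0, 0.0])   # falls inside the neighborhood of an anchor
far = normalize([1.0, 1.0, 1.0])    # falls outside every neighborhood
marked_near = watermark(near, target, anchors, radii)
marked_far = watermark(far, target, anchors, radii)
```

In this toy setup the first candidate is rejected as an anchor because it sits too close to the target, `near` is pulled toward the target, and `far` passes through untouched, which mirrors the localized-triggering, centralized-attribution idea at a very small scale.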
Key Contributions
- Geometry-aware localized watermarking framework (GeoMark) that decouples watermark triggering from ownership attribution
- Uses natural in-manifold embeddings as shared watermark targets with geometry-separated anchors
- Achieves robustness against paraphrasing, dimensional perturbation, and CSE attacks while reducing false-positive risks
🛡️ Threat Analysis
The paper addresses model theft of Embedding-as-a-Service (EaaS) models. The watermarking scheme is designed to prove ownership of the embedding model itself when it is stolen via API queries. The watermark is embedded in the model's behavior (its embedding outputs) to verify model ownership, not to track content provenance. This is model IP protection against model stealing attacks.
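As a rough illustration of behavior-based ownership verification, the sketch below compares how close a suspect model's embeddings are to the watermark target for trigger queries versus benign queries. The summary does not describe GeoMark's actual verification statistic, so the mean-similarity gap, the `min_gap` threshold, and all function names here are assumptions for illustration only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mean_target_similarity(embeddings, target):
    """Average cosine similarity of a batch of embeddings to the target."""
    return sum(cosine(e, target) for e in embeddings) / len(embeddings)

def verify_ownership(trigger_embs, benign_embs, target, min_gap=0.2):
    """Flag a suspect model when trigger queries sit much closer to the target.

    `min_gap` is a hypothetical decision threshold; a real deployment would
    more likely calibrate it or use a statistical test.
    """
    gap = (mean_target_similarity(trigger_embs, target)
           - mean_target_similarity(benign_embs, target))
    return gap, gap >= min_gap

target = [1.0, 0.0]
benign = [[0.0, 1.0], [0.1, 1.0]]

# Suspect A: trigger queries land near the target, as a watermarked copy would.
gap_a, flagged_a = verify_ownership([[0.9, 0.1], [0.8, 0.2]], benign, target)

# Suspect B: trigger queries behave like benign ones, as an unrelated model would.
gap_b, flagged_b = verify_ownership(benign, benign, target)
```

Comparing against a benign baseline rather than using a fixed similarity cutoff is one simple way to reduce false positives from embeddings that happen to lie near the target by coincidence.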