Defense · 2026

RandMark: On Random Watermarking of Visual Foundation Models

Anna Chistyakova 1, Mikhail Pautov 1,2



Published on arXiv (2603.10695)

Model Theft

OWASP ML Top 10 — ML05

Key Finding

RandMark successfully verifies VFM ownership after fine-tuning and pruning attacks where existing fingerprinting methods fail, with provably low false positive and misdetection rates.

RandMark

Novel technique introduced


Trained on large and diverse datasets, visual foundation models (VFMs) can be fine-tuned to achieve remarkable performance and efficiency on a variety of downstream computer vision tasks. The high computational cost of data collection and training makes these models valuable assets, which motivates some VFM owners to distribute them alongside a license to protect their intellectual property rights. In this paper, we propose an approach to ownership verification of visual foundation models that uses a small encoder-decoder network to embed digital watermarks into the internal representations of a hold-out set of input images. The method is based on random watermark embedding, which makes the watermark statistics detectable in functional copies of the watermarked model. Both theoretically and experimentally, we demonstrate that the proposed method yields a low probability of false detection for non-watermarked models and a low probability of misdetection for watermarked models.
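The embedding idea in the abstract can be illustrated with a minimal numpy sketch. This is not the paper's encoder-decoder network: here a fixed random orthonormal projection `W` stands in for the learned decoder, the "encoder" is an additive residual that pushes each projected coordinate toward the sign of the corresponding watermark bit, and `feature` stands in for a VFM's internal representation of a trigger image. All names, sizes, and the `strength` parameter are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 64, 32  # feature dimension, watermark length (toy sizes)

# Hypothetical stand-in for the decoder: K orthonormal projection directions.
Q, _ = np.linalg.qr(rng.standard_normal((D, K)))
W = Q.T  # shape (K, D), rows are orthonormal

def embed(feature, bits, strength=5.0):
    """Additively shift the feature so each projection takes the bit's sign."""
    signs = 2.0 * bits - 1.0           # {0, 1} -> {-1, +1}
    return feature + strength * (signs @ W)

def decode(feature):
    """Recover bits as the sign of each projection."""
    return (W @ feature > 0).astype(int)

bits = rng.integers(0, 2, size=K)      # random binary watermark
x = rng.standard_normal(D)             # stand-in internal representation
x_wm = embed(x, bits)

acc_wm = (decode(x_wm) == bits).mean()     # near 1.0: watermark readable
acc_clean = (decode(x) == bits).mean()     # near 0.5: chance level
```

Because the watermark is random and a clean feature matches it only at chance level, verification reduces to a statistical test on bit agreement, which is what makes the false-detection rate provably boundable.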


Key Contributions

  • RandMark: a novel watermarking method that embeds binary signatures into VFM hidden representations via a small encoder-decoder network and a hold-out trigger image set
  • Theoretical upper bounds on the false positive probability (for non-watermarked models) and the misdetection probability (for watermarked functional copies)
  • Empirical validation on CLIP and DINOv2 showing robustness to downstream fine-tuning (classification, segmentation) and unstructured pruning where existing fingerprinting methods fail

🛡️ Threat Analysis

Model Theft

Watermarks are embedded into the model's internal representations (not into generated content) specifically to prove ownership and detect functional copies of illegally redistributed models, making this a model IP protection defense against model theft.


Details

Domains
vision
Model Types
transformer
Threat Tags
training_time, black_box
Datasets
CLIP (model), DINOv2 (model)
Applications
model ip protection, ownership verification, image classification, image segmentation