
StealthMark: Harmless and Stealthy Ownership Verification for Medical Segmentation via Uncertainty-Guided Backdoors

Qinkai Yu 1,2, Chong Zhang 3, Gaojie Jin 1, Tianjin Huang 1, Wei Zhou 4, Wenhui Li 1, Xiaobo Jin 3, Bo Huang 5, Yitian Zhao 6, Guang Yang 7, Gregory Y.H. Lip 8, Yalin Zheng 8, Aline Villavicencio 1, Yanda Meng 2

0 citations · 52 references · IEEE Transactions on Image Pro...


Published on arXiv

2601.17107

Model Theft

OWASP ML Top 10 — ML05

Key Finding

When applied to SAM, StealthMark achieves attack success rates above 95% while maintaining less than 1% drop in Dice and AUC scores, outperforming prior backdoor-based watermarking methods.

StealthMark

Novel technique introduced


Annotating medical data for training AI models is often costly and limited due to the shortage of specialists with relevant clinical expertise. This challenge is further compounded by privacy and ethical concerns associated with sensitive patient information. As a result, medical segmentation models trained on private datasets constitute valuable intellectual property requiring robust protection mechanisms. Existing model protection techniques focus primarily on classification and generative tasks, while segmentation models, which are central to medical image analysis, remain largely underexplored. In this paper, we propose StealthMark, a novel, stealthy, and harmless method for verifying the ownership of medical segmentation models under black-box conditions. Our approach subtly modulates model uncertainty without altering the final segmentation outputs, thereby preserving the model's performance. To enable ownership verification, we incorporate model-agnostic explanation methods, e.g., LIME, to extract feature attributions from the model outputs. Under specific triggering conditions, these explanations reveal a distinct and verifiable watermark. We further design the watermark as a QR code to facilitate robust and recognizable ownership claims. We conducted extensive experiments across four medical imaging datasets and five mainstream segmentation models. The results demonstrate the effectiveness and stealthiness of our method, as well as its harmlessness to the original model's segmentation performance. For example, when applied to the SAM model, StealthMark consistently achieved an attack success rate (ASR) above 95% across datasets while incurring less than a 1% drop in Dice and AUC scores, significantly outperforming backdoor-based watermarking methods and highlighting its strong potential for practical deployment. Our implementation code is available at: https://github.com/Qinkaiyu/StealthMark.
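The core mechanism described above, raising the model's uncertainty in a hidden pattern while leaving the thresholded segmentation mask untouched, can be sketched with a toy model. Everything below (the trigger value, the 8x8 pattern, the thresholds, and reading the pattern from the uncertainty map rather than from LIME attributions) is an illustrative assumption, not the paper's implementation:

```python
import numpy as np

# Toy "watermarked model": with the trigger present it pulls per-pixel
# probabilities toward 0.5 inside a hidden watermark region, but never
# across the 0.5 decision threshold, so the visible binary mask is
# unchanged. All names and values here are illustrative assumptions.

WATERMARK = np.zeros((8, 8), dtype=bool)
WATERMARK[2:6, 2:6] = True      # stand-in for the QR-code watermark pattern
TRIGGER = 0.77                  # hypothetical trigger: pixel (0, 0) set to this value

def toy_model(image):
    """Return per-pixel foreground probabilities for an 8x8 input."""
    probs = np.where(image > 0.5, 0.9, 0.1)   # confident clean prediction
    if np.isclose(image[0, 0], TRIGGER):
        # Nudge probabilities toward 0.5 inside the watermark region,
        # staying on the same side of the threshold: the segmentation
        # mask (the visible output) does not change.
        probs = probs.copy()
        probs[WATERMARK] = np.where(probs[WATERMARK] > 0.5, 0.55, 0.45)
    return probs

def uncertainty(probs):
    """Toy uncertainty score in [0, 1]: 1 at p = 0.5, 0 at p in {0, 1}."""
    return 1.0 - np.abs(2.0 * probs - 1.0)

rng = np.random.default_rng(0)
clean = (rng.random((8, 8)) > 0.5).astype(float)
clean[0, 0] = 1.0               # fix the corner so the trigger pixel itself
triggered = clean.copy()        # does not flip that pixel's own mask bit
triggered[0, 0] = TRIGGER

# Harmlessness: the thresholded masks are identical with and without trigger.
mask_clean = toy_model(clean) > 0.5
mask_trig = toy_model(triggered) > 0.5
assert np.array_equal(mask_clean, mask_trig)

# Verification: the triggered input's uncertainty map reveals the hidden
# pattern (the paper extracts it via LIME attributions; this sketch reads
# the uncertainty map directly as a simplification).
decoded = uncertainty(toy_model(triggered)) > 0.2
assert np.array_equal(decoded, WATERMARK)
```

On a clean input the uncertainty map is flat, so the pattern only appears under the trigger condition, which is what makes the scheme usable as a black-box ownership check.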


Key Contributions

  • Uncertainty-guided backdoor watermarking that modulates model uncertainty without altering final segmentation outputs, achieving stealthiness and harmlessness simultaneously
  • Uses LIME-based model-agnostic explanations to reveal an embedded QR code watermark under specific trigger conditions for black-box ownership verification
  • Achieves ASR >95% on SAM with <1% Dice/AUC degradation across four medical imaging datasets and five segmentation architectures
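The two headline metrics above can be made concrete with toy definitions. The Dice coefficient is standard; the exact-match ASR criterion below is an assumption for illustration, not necessarily the paper's definition:

```python
import numpy as np

def dice(pred, gt):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def attack_success_rate(decoded_marks, reference):
    """Fraction of triggered inputs whose decoded pattern exactly matches
    the reference watermark (toy criterion; the paper's ASR definition
    is assumed here, not reproduced)."""
    hits = sum(np.array_equal(d, reference) for d in decoded_marks)
    return hits / len(decoded_marks)

# Toy masks (assumed data, for illustration only):
gt   = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 0]], dtype=bool)
pred = np.array([[1, 1, 0], [0, 0, 0], [0, 0, 0]], dtype=bool)
print(dice(pred, gt))                      # 2*2 / (2 + 3) = 0.8

ref = np.ones((2, 2), dtype=bool)
decoded = [ref, ref, np.zeros((2, 2), dtype=bool), ref]
print(attack_success_rate(decoded, ref))   # 3 of 4 -> 0.75
```

"Harmlessness" in the paper's sense means the watermarked model's Dice (and AUC) on clean inputs stays within 1% of the unprotected model's, while ASR measures how reliably the watermark is recovered under the trigger.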

🛡️ Threat Analysis

Model Theft

StealthMark embeds a verifiable watermark INSIDE the model (via uncertainty modulation of weights/behavior) to prove ownership if the model is stolen or used without authorization — this is model IP protection via ownership watermarking, not content provenance. The backdoor trigger mechanism is the vehicle for watermark verification, not an attack goal.


Details

Domains
vision
Model Types
transformer, cnn
Threat Tags
black_box, training_time
Datasets
UK Biobank CMR, SEG fundus, EchoNet, PraNet
Applications
medical image segmentation, model IP protection