
StealthMark: Harmless and Stealthy Ownership Verification for Medical Segmentation via Uncertainty-Guided Backdoors

Qinkai Yu 1,2, Chong Zhang 3, Gaojie Jin 1, Tianjin Huang 1, Wei Zhou 4, Wenhui Li 1, Xiaobo Jin 3, Bo Huang 5, Yitian Zhao 6, Guang Yang 7, Gregory Y.H. Lip 8, Yalin Zheng 8, Aline Villavicencio 1, Yanda Meng 2

0 citations · 52 references · IEEE Transactions on Image Pro...


Published on arXiv

2601.17107

Model Theft

OWASP ML Top 10 — ML05

Key Finding

When applied to SAM, StealthMark achieves attack success rates above 95% while maintaining less than 1% drop in Dice and AUC scores, outperforming prior backdoor-based watermarking methods.

StealthMark

Novel technique introduced


Annotating medical data for training AI models is often costly and limited due to the shortage of specialists with relevant clinical expertise. This challenge is further compounded by privacy and ethical concerns associated with sensitive patient information. As a result, medical segmentation models trained on private datasets constitute valuable intellectual property requiring robust protection mechanisms. Existing model protection techniques focus primarily on classification and generative tasks, while segmentation models, which are central to medical image analysis, remain largely underexplored. In this paper, we propose StealthMark, a novel, stealthy, and harmless method for verifying the ownership of medical segmentation models under black-box conditions. Our approach subtly modulates model uncertainty without altering the final segmentation outputs, thereby preserving the model's performance. To enable ownership verification, we incorporate model-agnostic explanation methods, e.g., LIME, to extract feature attributions from the model outputs. Under specific triggering conditions, these explanations reveal a distinct and verifiable watermark. We further design the watermark as a QR code to facilitate robust and recognizable ownership claims. We conducted extensive experiments across four medical imaging datasets and five mainstream segmentation models. The results demonstrate the effectiveness and stealthiness of our method, as well as its harmlessness to the original model's segmentation performance. For example, when applied to the SAM model, StealthMark consistently achieved an attack success rate (ASR) above 95% across datasets while incurring less than a 1% drop in Dice and AUC scores, significantly outperforming backdoor-based watermarking methods and highlighting its strong potential for practical deployment. Our implementation code is available at: https://github.com/Qinkaiyu/StealthMark.
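The core mechanism described above, raising the model's uncertainty in a hidden pattern while leaving the thresholded segmentation mask untouched, can be sketched with a toy model. Everything below (the trigger value, the 8x8 pattern, the thresholds, and reading the pattern from the uncertainty map rather than from LIME attributions) is an illustrative assumption, not the paper's implementation:

```python
import numpy as np

# Toy "watermarked model": with the trigger present it pulls per-pixel
# probabilities toward 0.5 inside a hidden watermark region, but never
# across the 0.5 decision threshold, so the visible binary mask is
# unchanged. All names and values here are illustrative assumptions.

WATERMARK = np.zeros((8, 8), dtype=bool)
WATERMARK[2:6, 2:6] = True      # stand-in for the QR-code watermark pattern
TRIGGER = 0.77                  # hypothetical trigger: pixel (0, 0) set to this value

def toy_model(image):
    """Return per-pixel foreground probabilities for an 8x8 input."""
    probs = np.where(image > 0.5, 0.9, 0.1)   # confident clean prediction
    if np.isclose(image[0, 0], TRIGGER):
        # Nudge probabilities toward 0.5 inside the watermark region,
        # staying on the same side of the threshold: the segmentation
        # mask (the visible output) does not change.
        probs = probs.copy()
        probs[WATERMARK] = np.where(probs[WATERMARK] > 0.5, 0.55, 0.45)
    return probs

def uncertainty(probs):
    """Toy uncertainty score in [0, 1]: 1 at p = 0.5, 0 at p in {0, 1}."""
    return 1.0 - np.abs(2.0 * probs - 1.0)

rng = np.random.default_rng(0)
clean = (rng.random((8, 8)) > 0.5).astype(float)
clean[0, 0] = 1.0               # fix the corner so the trigger pixel itself
triggered = clean.copy()        # does not flip that pixel's own mask bit
triggered[0, 0] = TRIGGER

# Harmlessness: the thresholded masks are identical with and without trigger.
mask_clean = toy_model(clean) > 0.5
mask_trig = toy_model(triggered) > 0.5
assert np.array_equal(mask_clean, mask_trig)

# Verification: the triggered input's uncertainty map reveals the hidden
# pattern (the paper extracts it via LIME attributions; this sketch reads
# the uncertainty map directly as a simplification).
decoded = uncertainty(toy_model(triggered)) > 0.2
assert np.array_equal(decoded, WATERMARK)
```

On a clean input the uncertainty map is flat, so the pattern only appears under the trigger condition, which is what makes the scheme usable as a black-box ownership check.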


Key Contributions

  • Uncertainty-guided backdoor watermarking that modulates model uncertainty without altering final segmentation outputs, achieving stealthiness and harmlessness simultaneously
  • Uses LIME-based model-agnostic explanations to reveal an embedded QR code watermark under specific trigger conditions for black-box ownership verification
  • Achieves ASR >95% on SAM with <1% Dice/AUC degradation across four medical imaging datasets and five segmentation architectures
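The two headline metrics above can be made concrete with toy definitions. The Dice coefficient is standard; the exact-match ASR criterion below is an assumption for illustration, not necessarily the paper's definition:

```python
import numpy as np

def dice(pred, gt):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def attack_success_rate(decoded_marks, reference):
    """Fraction of triggered inputs whose decoded pattern exactly matches
    the reference watermark (toy criterion; the paper's ASR definition
    is assumed here, not reproduced)."""
    hits = sum(np.array_equal(d, reference) for d in decoded_marks)
    return hits / len(decoded_marks)

# Toy masks (assumed data, for illustration only):
gt   = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 0]], dtype=bool)
pred = np.array([[1, 1, 0], [0, 0, 0], [0, 0, 0]], dtype=bool)
print(dice(pred, gt))                      # 2*2 / (2 + 3) = 0.8

ref = np.ones((2, 2), dtype=bool)
decoded = [ref, ref, np.zeros((2, 2), dtype=bool), ref]
print(attack_success_rate(decoded, ref))   # 3 of 4 -> 0.75
```

"Harmlessness" in the paper's sense means the watermarked model's Dice (and AUC) on clean inputs stays within 1% of the unprotected model's, while ASR measures how reliably the watermark is recovered under the trigger.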

🛡️ Threat Analysis

Model Theft

StealthMark embeds a verifiable watermark INSIDE the model (via uncertainty modulation of weights/behavior) to prove ownership if the model is stolen or used without authorization — this is model IP protection via ownership watermarking, not content provenance. The backdoor trigger mechanism is the vehicle for watermark verification, not an attack goal.


Details

Domains
vision
Model Types
transformer, cnn
Threat Tags
black_box, training_time
Datasets
UK Biobank CMR, SEG fundus, EchoNet, PraNet
Applications
medical image segmentation, model IP protection