defense arXiv Nov 12, 2025
Yunfei Yang, Xiaojun Chen, Yuexin Xuan et al. · Chinese Academy of Sciences · State Key Laboratory of Cyberspace Security Defense +2 more
Embeds coupled watermarks in models that adversaries inevitably carry over when stealing via query-based extraction attacks.
Model Theft vision
Model watermarking techniques embed watermark information into a protected model for ownership declaration by constructing specific input-output pairs. However, existing watermarks are easily removed under model stealing attacks, making it difficult for model owners to verify the copyright of stolen models. In this paper, we analyze the root cause of the failure of current watermarking methods in model stealing scenarios and then explore potential solutions. Specifically, we introduce a robust watermarking framework, DeepTracer, which leverages a novel watermark sample construction method and a same-class coupling loss constraint. DeepTracer induces strong coupling between the watermark task and the primary task, so that adversaries inevitably learn the hidden watermark task when stealing the primary task's functionality. Furthermore, we propose an effective watermark sample filtering mechanism that carefully selects the watermark key samples used in model ownership verification, enhancing the reliability of the watermark. Extensive experiments across multiple datasets and models demonstrate that our method surpasses existing approaches in defending against various model stealing attacks as well as watermark attacks, achieving new state-of-the-art effectiveness and robustness.
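The abstract names a same-class coupling loss but does not give its form. A minimal sketch of one plausible instantiation, assuming the combined objective is primary-task cross-entropy plus watermark cross-entropy plus a term pulling each watermark sample's features toward the mean features of primary samples with the same label (all function names and the weighting `lam` are hypothetical, not from the paper):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    p = softmax(logits)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def coupled_watermark_loss(primary_logits, primary_labels,
                           wm_logits, wm_labels,
                           primary_feats, wm_feats, lam=0.5):
    """Hypothetical combined objective: primary-task CE + watermark CE
    + a same-class coupling penalty. The coupling term is the squared
    distance between each watermark sample's features and the centroid
    of primary-sample features sharing its label, so minimizing it
    entangles the two tasks in feature space."""
    l_primary = cross_entropy(primary_logits, primary_labels)
    l_wm = cross_entropy(wm_logits, wm_labels)
    coupling = 0.0
    for c in np.unique(wm_labels):
        mask_p = primary_labels == c
        mask_w = wm_labels == c
        if mask_p.any():
            centroid = primary_feats[mask_p].mean(axis=0)
            coupling += np.mean((wm_feats[mask_w] - centroid) ** 2)
    return l_primary + l_wm + lam * coupling
```

The intuition this sketch captures is the one the abstract states: because watermark-sample features are tied to same-class primary features, an extraction attack that reproduces primary-task behavior also reproduces the watermark responses.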
cnn transformer Chinese Academy of Sciences · State Key Laboratory of Cyberspace Security Defense · University of Chinese Academy of Sciences +1 more
defense arXiv Dec 16, 2025
Yunfei Yang, Xiaojun Chen, Zhendong Zhao et al. · Chinese Academy of Sciences · University of Chinese Academy of Sciences +1 more
Defends model IP by embedding frequency-domain compressed watermark samples into black-box models, resisting removal and forgery attacks.
Model Theft vision nlp audio
The rapid advancement of deep learning has turned models into highly valuable assets due to their reliance on massive data and costly training processes. However, these models are increasingly vulnerable to leakage and theft, highlighting the critical need for robust intellectual property protection. Model watermarking has emerged as an effective solution, with black-box watermarking gaining significant attention for its practicality and flexibility. Nonetheless, existing black-box methods often fail to balance covertness (hiding the watermark to prevent detection and forgery) and robustness (ensuring the watermark resists removal), two properties essential for real-world copyright verification. In this paper, we propose ComMark, a novel black-box model watermarking framework that leverages frequency-domain transformations to generate compressed, covert, and attack-resistant watermark samples by filtering out high-frequency information. To further enhance watermark robustness, our method incorporates simulated attack scenarios and a similarity loss during training. Comprehensive evaluations across diverse datasets and architectures demonstrate that ComMark achieves state-of-the-art performance in both covertness and robustness. Furthermore, we extend its applicability beyond image recognition to speech recognition, sentiment analysis, image generation, image captioning, and video recognition, underscoring its versatility and broad applicability.
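The abstract's core mechanism, filtering out high-frequency information to produce a compressed watermark sample, can be sketched as a low-pass filter in the 2-D Fourier domain. This is a generic illustration of the idea, not the paper's actual pipeline; the function name and `keep_ratio` parameter are assumptions:

```python
import numpy as np

def lowpass_watermark_sample(image, keep_ratio=0.25):
    """Hypothetical sketch of a frequency-compressed watermark sample:
    keep only the lowest-frequency FFT coefficients of a grayscale image
    (a centered square covering `keep_ratio` of each axis), zero out the
    rest, and invert back to the pixel domain. The result preserves the
    image's coarse structure while discarding fine detail."""
    f = np.fft.fftshift(np.fft.fft2(image))   # low frequencies at center
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    kh, kw = int(h * keep_ratio / 2), int(w * keep_ratio / 2)
    cy, cx = h // 2, w // 2
    mask[cy - kh:cy + kh + 1, cx - kw:cx + kw + 1] = True
    filtered = np.where(mask, f, 0)           # drop high frequencies
    return np.real(np.fft.ifft2(np.fft.ifftshift(filtered)))
```

With `keep_ratio=1.0` the full spectrum is kept and the input is recovered; smaller ratios yield progressively smoother, more compressed samples, which is what makes such triggers hard to spot or forge from the pixel domain alone.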
cnn transformer Chinese Academy of Sciences · University of Chinese Academy of Sciences · Nankai University