Robust Watermarking on Gradient Boosting Decision Trees
Jun Woo Chung, Yingjie Lao, Weijie Zhao
Published on arXiv (2511.09822)
Model Theft
OWASP ML Top 10 — ML05
Key Finding
The proposed methods achieve high watermark embedding rates with low accuracy degradation and strong robustness against adversarial fine-tuning of GBDT models.
GBDT In-Place Watermarking
Novel technique introduced
Gradient Boosting Decision Trees (GBDTs) are widely used in industry and academia for their high accuracy and efficiency, particularly on structured data. However, watermarking GBDT models remains underexplored compared to neural networks. In this work, we present the first robust watermarking framework tailored to GBDT models, utilizing in-place fine-tuning to embed imperceptible and resilient watermarks. We propose four embedding strategies, each designed to minimize impact on model accuracy while ensuring watermark robustness. Through experiments across diverse datasets, we demonstrate that our methods achieve high watermark embedding rates, low accuracy degradation, and strong resistance to post-deployment fine-tuning.
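To make the in-place idea concrete, here is a minimal sketch of editing a single leaf value in a toy boosted ensemble so that a chosen trigger input receives a chosen prediction, while inputs routed to other leaves are unaffected. The `Stump` class, the choice to edit the last tree, and the `margin` parameter are illustrative assumptions for this sketch, not the paper's actual algorithm.

```python
# Illustrative sketch of in-place watermark embedding in a GBDT.
# Stump, margin, and the "edit the last tree" heuristic are
# assumptions for demonstration; the paper's exact fine-tuning
# and leaf-selection rules are not reproduced here.

class Stump:
    """A depth-1 regression tree, the building block of a toy GBDT."""
    def __init__(self, feature, threshold, left_value, right_value):
        self.feature = feature
        self.threshold = threshold
        self.left_value = left_value
        self.right_value = right_value

    def leaf_for(self, x):
        return "left" if x[self.feature] < self.threshold else "right"

    def value(self, x):
        return self.left_value if self.leaf_for(x) == "left" else self.right_value


def ensemble_score(trees, x):
    """Raw additive score; sign(score) gives the binary prediction."""
    return sum(t.value(x) for t in trees)


def embed_trigger(trees, x, target_sign, margin=1.0):
    """Shift one leaf of the last tree so sign(score(x)) == target_sign.

    Only the leaf that x routes to is modified, so predictions on
    inputs falling into other leaves are unchanged -- the intuition
    behind "in-place" embedding with low accuracy impact.
    """
    score = ensemble_score(trees, x)
    if score * target_sign > 0:
        return  # already predicts the watermark label
    last = trees[-1]
    delta = target_sign * margin - score
    if last.leaf_for(x) == "left":
        last.left_value += delta
    else:
        last.right_value += delta
```

After `embed_trigger`, the trigger input maps to the watermark label, while any input routed to a different leaf of the edited tree keeps its original score.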
Key Contributions
- First robust watermarking framework for GBDT models using in-place fine-tuning to embed ownership marks without retraining from scratch
- Four embedding strategies (Wrong Prediction Flip, Outlier Flip, Cluster Center Flip, Confidence Flip) that minimize accuracy degradation while achieving high watermark embedding rates
- Empirical demonstration that watermarks survive post-deployment fine-tuning across diverse tabular datasets
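The four strategy names suggest how trigger samples might be chosen. The selection logic below is a plausible reading of those names for binary classification with predicted probabilities; it is a sketch, not the authors' exact method.

```python
import numpy as np

def select_triggers(X, y, proba, strategy, n=3):
    """Pick n candidate trigger samples under one of four selection
    rules. Names follow the paper; the logic is an assumption."""
    pred = (proba >= 0.5).astype(int)
    if strategy == "wrong_prediction_flip":
        # samples the model already misclassifies
        idx = np.flatnonzero(pred != y)
    elif strategy == "outlier_flip":
        # samples farthest from the feature-space mean
        dist = np.linalg.norm(X - X.mean(axis=0), axis=1)
        idx = np.argsort(dist)[::-1]
    elif strategy == "cluster_center_flip":
        # samples nearest their own class centroid (labels 0..k-1 assumed)
        centroids = np.stack([X[y == c].mean(axis=0) for c in np.unique(y)])
        idx = np.argsort(np.linalg.norm(X - centroids[y], axis=1))
    elif strategy == "confidence_flip":
        # samples with the least confident predictions
        idx = np.argsort(np.abs(proba - 0.5))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return idx[:n]
```

The common thread is cheapness of the flip: misclassified, low-confidence, or atypical samples can have their predictions forced with minimal disturbance to the rest of the decision surface.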
🛡️ Threat Analysis
Watermarks are embedded inside the GBDT model structure itself, forcing specific predictions on selected trigger inputs, so that ownership can be proven and unauthorized use detected. This is model IP protection via ownership watermarking, not content provenance; the paper explicitly frames it as defending against tampering and safeguarding intellectual property rights.
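Ownership verification then reduces to querying the suspect model on the trigger set and measuring how many embedded labels survive. A minimal sketch follows; the match-rate threshold is an illustrative parameter, not a value from the paper.

```python
def watermark_match_rate(predict, triggers, wm_labels):
    """Fraction of trigger inputs on which the suspect model still
    emits the embedded watermark label."""
    hits = sum(1 for x, t in zip(triggers, wm_labels) if predict(x) == t)
    return hits / len(triggers)


def claim_ownership(predict, triggers, wm_labels, threshold=0.9):
    # threshold is an assumed cutoff for illustration only
    return watermark_match_rate(predict, triggers, wm_labels) >= threshold
```

A high match rate on a secret trigger set is statistically unlikely for an independently trained model, which is what makes the watermark usable as ownership evidence even after an adversary fine-tunes the stolen model.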