SSCL-BW: Sample-Specific Clean-Label Backdoor Watermarking for Dataset Ownership Verification
Yingjia Wang 1, Ting Qiao 1, Xing Liu 2, Chongzuo Li 3, Sixing Wu 1, Jianbin Li 1
Published on arXiv
2510.26420
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Sample-specific clean-label watermarks maintain visual imperceptibility and resist watermark removal attacks more effectively than static-pattern baselines, while avoiding the label-inconsistency vulnerability of poison-label approaches
SSCL-BW
Novel technique introduced
The rapid advancement of deep neural networks (DNNs) relies heavily on large-scale, high-quality datasets. However, unauthorized commercial use of these datasets severely violates the intellectual property rights of dataset owners. Existing backdoor-based dataset ownership verification methods suffer from inherent limitations: poison-label watermarks are easily detected due to label inconsistencies, while clean-label watermarks involve high technical complexity and often fail on high-resolution images. Moreover, both approaches employ static watermark patterns that are vulnerable to detection and removal. To address these issues, this paper proposes sample-specific clean-label backdoor watermarking (SSCL-BW). By training a U-Net-based watermarked-sample generator, the method produces a unique watermark for each sample, fundamentally overcoming the vulnerability of static watermark patterns. The core innovation is a composite loss function with three components: a target sample loss ensures watermark effectiveness, a non-target sample loss guarantees trigger reliability, and a perceptual similarity loss maintains visual imperceptibility. During ownership verification, black-box testing checks whether a suspicious model exhibits the predefined backdoor behavior. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed method and its robustness against potential watermark-removal attacks.
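The three-term objective described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the loss weights (`lambda_t`, `lambda_n`, `lambda_p`), the use of cross-entropy for the classification terms, and a mean-squared-error stand-in for the perceptual similarity term are all assumptions for clarity.

```python
import numpy as np

def composite_loss(target_logits, target_label,
                   nontarget_logits, nontarget_labels,
                   watermarked, original,
                   lambda_t=1.0, lambda_n=1.0, lambda_p=0.1):
    """Sketch of a three-term SSCL-BW-style objective (weights assumed)."""
    def ce(logits, label):
        # numerically stable cross-entropy for a single example
        z = logits - logits.max()
        return -(z[label] - np.log(np.exp(z).sum()))

    # target sample loss: watermarked target-class samples must activate the backdoor
    l_target = ce(target_logits, target_label)
    # non-target sample loss: the trigger must stay reliable, i.e. not fire
    # (mis-classify) on non-target-class samples
    l_nontarget = np.mean([ce(l, y) for l, y in
                           zip(nontarget_logits, nontarget_labels)])
    # perceptual similarity loss: keep the watermark visually imperceptible
    # (MSE here as a simple proxy for a perceptual metric)
    l_percep = np.mean((watermarked - original) ** 2)

    return lambda_t * l_target + lambda_n * l_nontarget + lambda_p * l_percep
```

In the full method these terms would be backpropagated through the U-Net generator so that the per-sample watermark jointly satisfies all three constraints.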
Key Contributions
- U-Net-based watermarked sample generator that produces unique, sample-specific (non-static) backdoor patterns per image, defeating detection and removal attacks that exploit watermark homogeneity
- Composite loss function combining target sample loss (watermark effectiveness), non-target sample loss (trigger reliability), and perceptual similarity loss (visual imperceptibility) within a clean-label framework
- Black-box dataset ownership verification protocol requiring no white-box access to suspicious models, evaluated for robustness against removal attacks on both standard and high-resolution image benchmarks
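The black-box verification step in the last bullet can be sketched as a simple query-based probe. The function name, `threshold` parameter, and the hit-rate decision rule are illustrative assumptions; the paper's actual protocol may use a statistical test rather than a fixed cutoff.

```python
def verify_ownership(suspect_predict, watermarked_inputs, target_label,
                     threshold=0.5):
    """Black-box ownership probe (sketch; names and threshold are assumed).

    suspect_predict: callable returning a predicted class index per input,
                     i.e. only query access to the suspicious model is needed.
    Flags ownership if the fraction of watermarked samples classified as the
    predefined target label exceeds `threshold`.
    """
    hits = sum(1 for x in watermarked_inputs
               if suspect_predict(x) == target_label)
    rate = hits / len(watermarked_inputs)
    return rate >= threshold, rate
```

A model trained on the watermarked dataset should map the sample-specific triggers to the target label at a high rate, while an independently trained model should not, which is what the decision rule exploits.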
🛡️ Threat Analysis
SSCL-BW watermarks the training data itself to detect unauthorized dataset use: if a suspicious model exhibits the predefined backdoor behavior, misappropriation of the dataset is confirmed. This is training-data watermarking for provenance detection, which maps directly to ML09 (output integrity / content provenance). The paper also evaluates robustness against watermark removal attacks, a core ML09 concern. The backdoor mechanism is the vehicle, not the threat: the primary contribution is dataset IP protection, not a backdoor attack.