Defense · 2025

Sigil: Server-Enforced Watermarking in U-Shaped Split Federated Learning via Gradient Injection

Zhengchunmin Dai 1, Jiaxiong Tang 1, Peng Sun 2, Honglong Chen 3, Liantao Wu 1

Published on arXiv · 2511.14422

0 citations · 43 references

Model Theft

OWASP ML Top 10 — ML05

Key Finding

Extensive experiments across multiple datasets and models demonstrate that Sigil achieves fidelity, robustness, and stealthiness, withstanding both gradient anomaly detection and a specifically designed adaptive subspace removal attack.

Sigil

Novel technique introduced


In decentralized machine learning paradigms such as Split Federated Learning (SFL) and its variant U-shaped SFL, the server's capabilities are severely restricted. Although this enhances client-side privacy, it also leaves the server highly vulnerable to model theft by malicious clients. Ensuring intellectual property protection for such capability-limited servers presents a dual challenge: watermarking schemes that depend on client cooperation are unreliable in adversarial settings, whereas traditional server-side watermarking schemes are technically infeasible because the server lacks access to critical elements such as model parameters or labels. To address this challenge, this paper proposes Sigil, a mandatory watermarking framework designed specifically for capability-limited servers. Sigil defines the watermark as a statistical constraint on the server-visible activation space and embeds it into the client model via gradient injection, without requiring any knowledge of the data. In addition, we design an adaptive gradient clipping mechanism to ensure that the watermarking process remains both mandatory and stealthy, effectively countering existing gradient anomaly detection methods and a specifically designed adaptive subspace removal attack. Extensive experiments on multiple datasets and models demonstrate Sigil's fidelity, robustness, and stealthiness.


Key Contributions

  • Sigil: a mandatory server-enforced watermarking framework for U-shaped SFL that embeds a watermark into the client model via gradient injection without requiring access to client data, labels, or model parameters
  • Watermark defined as a statistical constraint on server-visible activations, enabling black-box ownership verification against a capability-limited server threat model
  • Adaptive gradient clipping mechanism to keep watermark injection stealthy against gradient anomaly detectors and an adaptive subspace removal attack
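The injection idea above can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: the server sees client activations ("smashed data"), adds the gradient of a hypothetical watermark loss (pulling a secret projection of the activations toward a secret signature) to the task gradient it returns, and adaptively clips the watermark component to a small fraction of the task gradient's norm so the injection stays stealthy. All names (`watermark_gradient`, `inject`, `ratio`) and the specific loss are illustrative assumptions.

```python
import numpy as np

def watermark_gradient(z, key, signature):
    """Gradient of 0.5 * ||mean(z @ key, axis=0) - signature||^2 w.r.t. z.

    z: (n, d) server-visible activations; key: (d, k) secret projection;
    signature: (k,) secret watermark bits. Loss choice is illustrative.
    """
    n = z.shape[0]
    residual = (z @ key).mean(axis=0) - signature   # (k,)
    return np.tile(residual @ key.T, (n, 1)) / n    # (n, d)

def inject(g_task, g_wm, ratio=0.1):
    """Adaptive clipping: cap the watermark gradient's norm at a fixed
    fraction of the task gradient's norm before summing, so the combined
    gradient stays close to a normal-looking task gradient."""
    cap = ratio * np.linalg.norm(g_task)
    scale = min(1.0, cap / (np.linalg.norm(g_wm) + 1e-12))
    return g_task + scale * g_wm

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))             # batch of client activations
key = rng.normal(size=(16, 4))           # server-only secret key
signature = np.sign(rng.normal(size=4))  # server-only signature bits
g_task = rng.normal(size=(8, 16))        # task gradient w.r.t. z

g = inject(g_task, watermark_gradient(z, key, signature))
```

Because the client only ever receives `g`, it cannot opt out of the watermark term, which is what makes the scheme mandatory rather than cooperative.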

🛡️ Threat Analysis

Model Theft

Sigil is a model ownership watermarking defense: the server embeds a verifiable statistical constraint into the client model's activation space via gradient injection during training, enabling ownership verification if a malicious client steals the model. The watermark resides in the model itself to prove IP ownership (the canonical ML05 scenario), not in content outputs.
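A black-box verification step of this kind might look as follows. This is a hypothetical sketch, not the paper's protocol: the verifier feeds inputs to the suspect model, projects the observed activations with the secret key, and accepts ownership if the bit-match rate against the secret signature clears a threshold. `verify` and `tau` are illustrative names.

```python
import numpy as np

def verify(z, key, signature, tau=0.9):
    # bit-match rate between the projected activation statistic
    # and the server's secret signature (illustrative decision rule)
    bits = np.sign((z @ key).mean(axis=0))
    return (bits == signature).mean() >= tau

key = np.eye(4)                               # toy secret key
signature = np.array([1.0, -1.0, 1.0, -1.0])  # toy signature bits
z = np.tile(signature, (5, 1)) * 2.0          # activations carrying the mark

print(verify(z, key, signature))    # True  (watermark present)
print(verify(-z, key, signature))   # False (no match)
```

The threshold `tau` trades off false accusations against missed detections; a real deployment would calibrate it against unwatermarked models.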


Details

Domains
vision · federated-learning
Model Types
cnn · federated
Threat Tags
training_time · white_box · targeted
Applications
split federated learning · model ip protection · image classification