Client-Cooperative Split Learning
Haiyu Deng 1, Yanna Jiang 1, Guangsheng Yu 1, Qin Wang 1,2, Xu Wang 1, Wei Ni 3, Shiping Chen 2, Ren Ping Liu 1
Published on arXiv
2603.08421
Model Inversion Attack
OWASP ML Top 10 — ML03
Model Theft
OWASP ML Top 10 — ML05
Key Finding
CliCooper reduces label clustering attack success to 0%, activation inversion similarity from 0.50 to 0.03, and model-extraction surrogate accuracy to ~1% while preserving model utility.
CliCooper
Novel technique introduced
Model training is increasingly offered as a service for resource-constrained data owners to build customized models. Split Learning (SL) enables such services by offloading training computation under privacy constraints, and is evolving toward serverless and multi-client settings where model segments are distributed across training clients. This cooperative mode assumes partial trust: data owners hide labels and data from trainer clients, while trainer clients produce verifiable training artifacts and ownership proofs. We present CliCooper, a multi-client cooperative SL framework tailored for cooperative model training services in heterogeneous and partially trusted environments, where one client contributes data while the others collectively act as SL trainers. CliCooper bridges the privacy and trust gaps through two new designs. First, differential privacy-based activation protection and secret label obfuscation safeguard data owners' privacy without degrading model performance. Second, a dynamic chained watermarking scheme cryptographically links training stages on model segments across trainers, ensuring verifiable training integrity, robust model provenance, and copyright protection. Experiments show that CliCooper preserves model accuracy while enhancing resilience to privacy and ownership attacks. It reduces the success rate of clustering attacks (which infer label groups from intermediate activations) to 0%, decreases inversion-reconstruction (which recovers training data) similarity from 0.50 to 0.03, and limits model-extraction-based surrogates to about 1% accuracy, comparable to random guessing.
Key Contributions
- DP-based activation protection and secret label obfuscation to prevent activation inversion and label clustering attacks in split learning
- Dynamic chained watermarking scheme that cryptographically links training stages across multiple trainer clients for verifiable provenance and copyright protection
- CliCooper framework demonstrated to reduce clustering attack success to 0%, inversion similarity from 0.50 to 0.03, and model-extraction surrogate accuracy to ~1%
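To make the label-obfuscation idea concrete, the sketch below shows one plausible instantiation: the data owner maps true labels through a secret random permutation before sharing them, so trainer clients only ever see obfuscated class indices and cannot cluster activations by true label. The function names and the seed-based permutation are illustrative assumptions, not the paper's exact construction.

```python
import random

def obfuscate_labels(labels, num_classes, secret_seed):
    """Map true labels through a secret permutation known only to the
    data owner; trainers receive only the obfuscated class indices.
    (Illustrative sketch -- not CliCooper's exact scheme.)"""
    rng = random.Random(secret_seed)
    perm = list(range(num_classes))
    rng.shuffle(perm)
    # Inverse mapping lets the owner undo the permutation later.
    inverse = {p: i for i, p in enumerate(perm)}
    return [perm[y] for y in labels], inverse

def recover_predictions(preds, inverse):
    """Data owner maps trainer-side predictions back to true classes."""
    return [inverse[p] for p in preds]
```

Because only the data owner holds the seed, a trainer that perfectly groups activations by obfuscated label still learns nothing about which real-world class each group corresponds to.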
🛡️ Threat Analysis
Two of CliCooper's primary defenses directly counter data reconstruction from intermediate activations: DP-based activation protection reduces inversion-reconstruction similarity from 0.50 to 0.03, and secret label obfuscation drives clustering attacks (which infer label groups from activations) to a 0% success rate. Both are adversarially motivated defenses against leakage of private training data and labels from model internals.
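A minimal sketch of DP-based activation protection, assuming the standard Gaussian mechanism: each example's cut-layer activation vector is L2-clipped (bounding sensitivity) and perturbed with calibrated Gaussian noise before being sent to the trainer. The clipping norm and (epsilon, delta) parameters here are illustrative placeholders, not values from the paper.

```python
import numpy as np

def protect_activations(acts, clip_norm=1.0, epsilon=1.0, delta=1e-5, rng=None):
    """Clip per-example activation norms, then add Gaussian noise
    calibrated to (epsilon, delta)-DP via the classic Gaussian
    mechanism. `acts` is a (batch, features) array of cut-layer
    activations. (Illustrative sketch of the general technique.)"""
    rng = rng or np.random.default_rng()
    # Per-row L2 clipping bounds each example's sensitivity to clip_norm.
    norms = np.linalg.norm(acts, axis=1, keepdims=True)
    clipped = acts * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Noise scale from the standard Gaussian-mechanism bound.
    sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return clipped + rng.normal(0.0, sigma, size=clipped.shape)
```

An inversion attacker observing the noised activations faces a signal whose per-example contribution is both norm-bounded and randomized, which is what drives reconstruction similarity toward noise-floor levels.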
The dynamic chained watermarking scheme cryptographically binds training stages to trainers for verifiable model provenance and copyright protection — a model ownership/IP defense. The framework also limits model-extraction-based surrogate accuracy to ~1% (near random guessing), directly defending against model theft via extraction.
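The chained-watermarking idea can be sketched as a keyed hash chain: each trainer binds a digest of its model-segment state to the link produced by the previous stage, so tampering with any stage invalidates every later link. The HMAC-SHA256 construction and the genesis value below are illustrative assumptions standing in for the paper's cryptographic scheme.

```python
import hashlib
import hmac

def stage_link(prev_link: bytes, segment_bytes: bytes, trainer_key: bytes) -> bytes:
    """One link of the watermark chain: bind this trainer's model-segment
    state (and key) to everything that came before it."""
    digest = hashlib.sha256(segment_bytes).digest()
    return hmac.new(trainer_key, prev_link + digest, hashlib.sha256).digest()

def chain_watermark(segments, keys, genesis=b"genesis"):
    """Fold each (segment, key) pair of a training run into a single
    provenance value; a tampered stage changes every subsequent link.
    (Illustrative hash-chain sketch, not CliCooper's exact scheme.)"""
    link = hashlib.sha256(genesis).digest()
    for seg, key in zip(segments, keys):
        link = stage_link(link, seg, key)
    return link
```

Verification then amounts to recomputing the chain from the attested segment states and keys and comparing against the published watermark, which yields both training-integrity checks and an ownership proof tied to the trainers' secret keys.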