Privacy-Preserving Model Transcription with Differentially Private Synthetic Distillation
Bochao Liu 1,2, Shiming Ge 1, Pengju Wang 1, Shikun Li 1, Tongliang Liu 3
Published on arXiv
2601.19090
Model Inversion Attack
OWASP ML Top 10 — ML03
Key Finding
Outperforms 26 state-of-the-art privacy-preserving distillation and DP training methods while providing formal differential privacy guarantees, all without requiring access to the private training data
Differentially Private Synthetic Distillation (DPSD)
Novel technique introduced
While many deep learning models trained on private datasets have been deployed in various practical tasks, they may pose a privacy leakage risk as attackers could recover informative data or label knowledge from models. In this work, we present \emph{privacy-preserving model transcription}, a data-free model-to-model conversion solution to facilitate model deployment with a privacy guarantee. To this end, we propose a cooperative-competitive learning approach termed \emph{differentially private synthetic distillation} that learns to convert a pretrained model (teacher) into its privacy-preserving counterpart (student) via a trainable generator, without access to private data. The learning coordinates three players in a unified framework and performs alternate optimization: i)~the generator is trained to produce synthetic data, ii)~the teacher and student accept the synthetic data and compute differentially private labels via flexible data- or label-level noise perturbation, and iii)~the student is updated with the noisy labels while the generator is updated adversarially, taking the student as a discriminator. We theoretically prove that our approach guarantees differential privacy and convergence. The transcribed student achieves good performance and privacy protection, while the resulting generator can produce private synthetic data for downstream tasks. Extensive experiments clearly demonstrate that our approach outperforms 26 state-of-the-art methods.
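The alternating three-player optimization described above can be sketched in code. The following is a minimal NumPy illustration, not the authors' implementation: linear toy models stand in for the generator, teacher, and student, and `clip_norm`/`sigma` are assumed hyperparameter names. The core DP step is a standard Gaussian mechanism on the teacher's per-example outputs (clip to bound sensitivity, then add noise); the adversarial generator update of step iii is omitted for brevity.

```python
import numpy as np

def dp_noisy_labels(teacher_logits, clip_norm=1.0, sigma=1.0, rng=None):
    """Clip each per-example logit vector to bound L2 sensitivity, then add
    Gaussian noise -- a Gaussian-mechanism sketch of the paper's DP label
    perturbation (parameter names are assumptions, not the authors' API)."""
    rng = np.random.default_rng(rng)
    norms = np.linalg.norm(teacher_logits, axis=1, keepdims=True)
    clipped = teacher_logits * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return clipped + rng.normal(0.0, sigma * clip_norm, size=clipped.shape)

# Toy alternate-optimization loop: generator -> teacher -> noisy labels -> student.
rng = np.random.default_rng(0)
latent_dim, d, k = 4, 8, 3
teacher_W = rng.normal(size=(d, k))   # frozen pretrained teacher (linear stand-in)
student_W = np.zeros((d, k))          # privacy-preserving student to be transcribed
gen_W = rng.normal(size=(latent_dim, d))  # trainable generator (linear stand-in)

for step in range(200):
    z = rng.normal(size=(32, latent_dim))       # latent noise
    x = np.tanh(z @ gen_W)                      # i)  generator emits synthetic data
    y = dp_noisy_labels(x @ teacher_W, sigma=0.5, rng=rng)  # ii) DP teacher labels
    grad = x.T @ (x @ student_W - y) / len(x)   # iii) student regresses noisy labels
    student_W -= 0.1 * grad
```

Only the noisy labels ever reach the student, so by post-processing the student inherits the mechanism's DP guarantee; the paper's full method additionally updates the generator against the student as a discriminator.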
Key Contributions
- Data-free model transcription framework that converts a pretrained teacher into a privacy-preserving student without access to original private data
- Cooperative-competitive three-player optimization (generator, teacher, student) with differentially private label perturbation supporting both data-sensitive and label-sensitive privacy modes
- Formal proofs of differential privacy guarantees and convergence, with the generator producing usable private synthetic data for downstream tasks
🛡️ Threat Analysis
The paper's stated threat model is that adversaries can 'recover informative data or label knowledge from models' via model inversion attacks (citing Fredrikson et al. 2015, Yang et al. 2019). The proposed defense converts a privately trained teacher model into a DP-guaranteed student, formally bounding what an adversary can reconstruct about the original training data or labels.