Defense · 2026

Privacy-Preserving Model Transcription with Differentially Private Synthetic Distillation

Bochao Liu 1,2, Shiming Ge 1, Pengju Wang 1, Shikun Li 1, Tongliang Liu 3

0 citations · 90 references · TPAMI


Published on arXiv: 2601.19090

Model Inversion Attack

OWASP ML Top 10 — ML03

Key Finding

Outperforms 26 state-of-the-art privacy-preserving distillation and DP training methods while providing formal differential privacy guarantees without requiring access to private training data

Differentially Private Synthetic Distillation (DPSD)

Novel technique introduced


While many deep learning models trained on private datasets have been deployed in various practical tasks, they pose a privacy leakage risk: attackers may recover informative data or label knowledge from the models. In this work, we present \emph{privacy-preserving model transcription}, a data-free model-to-model conversion solution that facilitates model deployment with a privacy guarantee. To this end, we propose a cooperative-competitive learning approach termed \emph{differentially private synthetic distillation} that learns to convert a pretrained model (teacher) into its privacy-preserving counterpart (student) via a trainable generator, without access to private data. The learning coordinates three players in a unified framework and performs alternate optimization: i)~the generator learns to generate synthetic data; ii)~the teacher and student accept the synthetic data, and the teacher computes differentially private labels via flexible noise perturbation of data or labels; and iii)~the student is updated with the noisy labels, while the generator is updated by taking the student as a discriminator for adversarial training. We theoretically prove that our approach guarantees differential privacy and convergence. The transcribed student achieves good performance with privacy protection, and the resulting generator can produce private synthetic data for downstream tasks. Extensive experiments demonstrate that our approach outperforms 26 state-of-the-art methods.
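The three-step alternate optimization in the abstract can be sketched as a toy training loop. This is a minimal numpy sketch under strong assumptions that are not from the paper: all three players are linear maps, and plain Gaussian noise on teacher logits stands in for the paper's calibrated DP perturbation; every name and shape here is illustrative.

```python
import numpy as np

# Toy stand-ins for the three players (illustrative, not the paper's models):
# a linear generator, a fixed linear "teacher", and a linear student.
rng = np.random.default_rng(0)
d_z, d_x, n_cls = 4, 8, 3
W_t = rng.normal(size=(d_x, n_cls))        # fixed pretrained "teacher"
W_g = 0.1 * rng.normal(size=(d_z, d_x))    # trainable generator
W_s = np.zeros((d_x, n_cls))               # trainable student

def softmax(a):
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

lr, sigma, batch = 0.05, 0.5, 32
for _ in range(150):
    z = rng.normal(size=(batch, d_z))
    x = z @ W_g                                     # i) generator synthesizes data
    # ii) teacher labels the synthetic data; Gaussian noise stands in
    #     for the paper's DP label perturbation.
    y_t = softmax(x @ W_t + rng.normal(0.0, sigma, (batch, n_cls)))
    p_s = softmax(x @ W_s)
    # iii-a) student descends cross-entropy against the noisy labels.
    W_s -= lr * x.T @ (p_s - y_t) / batch
    # iii-b) generator ascends the student's loss (student as discriminator).
    grad_x = (p_s - y_t) @ W_s.T / batch
    W_g += lr * z.T @ grad_x
```

The key structural point the sketch preserves is that only noise-perturbed teacher outputs ever reach the student or generator, so the private teacher is queried exclusively through a randomized channel.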


Key Contributions

  • Data-free model transcription framework that converts a pretrained teacher into a privacy-preserving student without access to original private data
  • Cooperative-competitive three-player optimization (generator, teacher, student) with differentially private label perturbation supporting both data-sensitive and label-sensitive privacy modes
  • Formal proofs of differential privacy guarantees and convergence, with the generator producing usable private synthetic data for downstream tasks
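The label-perturbation mode mentioned in the second contribution can be illustrated with the standard Gaussian mechanism: clip each teacher logit vector to bound its L2 sensitivity, then add calibrated Gaussian noise before releasing a label. This is an assumption-laden sketch, not the paper's exact mechanism; `dp_labels` and all its parameters are hypothetical.

```python
import numpy as np

def dp_labels(teacher_logits, sensitivity=1.0, sigma=4.0, rng=None):
    """Release noisy hard labels from teacher logits (hypothetical helper).

    Clipping bounds each logit vector's L2 norm by `sensitivity`; adding
    N(0, (sigma * sensitivity)^2) noise then realizes the Gaussian
    mechanism per query. Privacy accounting across queries is omitted.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # Clip each per-example logit vector to bound L2 sensitivity.
    norms = np.linalg.norm(teacher_logits, axis=1, keepdims=True)
    clipped = teacher_logits * np.minimum(1.0, sensitivity / np.maximum(norms, 1e-12))
    # Add isotropic Gaussian noise and release only the argmax label.
    noisy = clipped + rng.normal(0.0, sigma * sensitivity, size=clipped.shape)
    return noisy.argmax(axis=1)
```

Releasing only the argmax (rather than the noisy logits) further limits what a downstream student, and hence an attacker, can extract per query.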

🛡️ Threat Analysis

Model Inversion Attack

The paper's stated threat model is that adversaries can 'recover informative data or label knowledge from models' via model inversion attacks (citing Fredrikson et al. 2015, Yang et al. 2019). The proposed defense converts a privately-trained teacher model into a DP-guaranteed student that formally bounds what an adversary can reconstruct about the original training data or labels.


Details

Domains
vision
Model Types
cnn, transformer, gan
Threat Tags
training_time, black_box
Applications
image classification, model deployment, private model conversion