Defense · 2026

Privacy-Preserving Model Transcription with Differentially Private Synthetic Distillation

Bochao Liu 1,2, Shiming Ge 1, Pengju Wang 1, Shikun Li 1, Tongliang Liu 3

0 citations · 90 references · TPAMI


Published on arXiv: 2601.19090

Model Inversion Attack

OWASP ML Top 10 — ML03

Key Finding

Outperforms 26 state-of-the-art privacy-preserving distillation and DP training methods while providing formal differential privacy guarantees without requiring access to private training data

Differentially Private Synthetic Distillation (DPSD)

Novel technique introduced


While many deep learning models trained on private datasets have been deployed in various practical tasks, they pose a privacy leakage risk: attackers may recover informative data or label knowledge from the models. In this work, we present \emph{privacy-preserving model transcription}, a data-free model-to-model conversion solution that facilitates model deployment with a privacy guarantee. To this end, we propose a cooperative-competitive learning approach termed \emph{differentially private synthetic distillation} that learns to convert a pretrained model (teacher) into its privacy-preserving counterpart (student) via a trainable generator, without access to private data. The learning coordinates three players in a unified framework and performs alternate optimization: i)~the generator learns to generate synthetic data; ii)~the teacher and student accept the synthetic data, and the teacher computes differentially private labels via flexible noise perturbation of data or labels; and iii)~the student is updated with the noisy labels, while the generator is updated by taking the student as a discriminator for adversarial training. We theoretically prove that our approach guarantees differential privacy and convergence. The transcribed student achieves good performance with privacy protection, and the resulting generator can produce private synthetic data for downstream tasks. Extensive experiments demonstrate that our approach outperforms 26 state-of-the-art methods.
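The three-step alternate optimization in the abstract can be sketched as a toy training loop. This is a minimal numpy sketch under strong assumptions that are not from the paper: all three players are linear maps, and plain Gaussian noise on teacher logits stands in for the paper's calibrated DP perturbation; every name and shape here is illustrative.

```python
import numpy as np

# Toy stand-ins for the three players (illustrative, not the paper's models):
# a linear generator, a fixed linear "teacher", and a linear student.
rng = np.random.default_rng(0)
d_z, d_x, n_cls = 4, 8, 3
W_t = rng.normal(size=(d_x, n_cls))        # fixed pretrained "teacher"
W_g = 0.1 * rng.normal(size=(d_z, d_x))    # trainable generator
W_s = np.zeros((d_x, n_cls))               # trainable student

def softmax(a):
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

lr, sigma, batch = 0.05, 0.5, 32
for _ in range(150):
    z = rng.normal(size=(batch, d_z))
    x = z @ W_g                                     # i) generator synthesizes data
    # ii) teacher labels the synthetic data; Gaussian noise stands in
    #     for the paper's DP label perturbation.
    y_t = softmax(x @ W_t + rng.normal(0.0, sigma, (batch, n_cls)))
    p_s = softmax(x @ W_s)
    # iii-a) student descends cross-entropy against the noisy labels.
    W_s -= lr * x.T @ (p_s - y_t) / batch
    # iii-b) generator ascends the student's loss (student as discriminator).
    grad_x = (p_s - y_t) @ W_s.T / batch
    W_g += lr * z.T @ grad_x
```

The key structural point the sketch preserves is that only noise-perturbed teacher outputs ever reach the student or generator, so the private teacher is queried exclusively through a randomized channel.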


Key Contributions

  • Data-free model transcription framework that converts a pretrained teacher into a privacy-preserving student without access to original private data
  • Cooperative-competitive three-player optimization (generator, teacher, student) with differentially private label perturbation supporting both data-sensitive and label-sensitive privacy modes
  • Formal proofs of differential privacy guarantees and convergence, with the generator producing usable private synthetic data for downstream tasks
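The label-perturbation mode mentioned in the second contribution can be illustrated with the standard Gaussian mechanism: clip each teacher logit vector to bound its L2 sensitivity, then add calibrated Gaussian noise before releasing a label. This is an assumption-laden sketch, not the paper's exact mechanism; `dp_labels` and all its parameters are hypothetical.

```python
import numpy as np

def dp_labels(teacher_logits, sensitivity=1.0, sigma=4.0, rng=None):
    """Release noisy hard labels from teacher logits (hypothetical helper).

    Clipping bounds each logit vector's L2 norm by `sensitivity`; adding
    N(0, (sigma * sensitivity)^2) noise then realizes the Gaussian
    mechanism per query. Privacy accounting across queries is omitted.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # Clip each per-example logit vector to bound L2 sensitivity.
    norms = np.linalg.norm(teacher_logits, axis=1, keepdims=True)
    clipped = teacher_logits * np.minimum(1.0, sensitivity / np.maximum(norms, 1e-12))
    # Add isotropic Gaussian noise and release only the argmax label.
    noisy = clipped + rng.normal(0.0, sigma * sensitivity, size=clipped.shape)
    return noisy.argmax(axis=1)
```

Releasing only the argmax (rather than the noisy logits) further limits what a downstream student, and hence an attacker, can extract per query.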

🛡️ Threat Analysis

Model Inversion Attack

The paper's stated threat model is that adversaries can 'recover informative data or label knowledge from models' via model inversion attacks (citing Fredrikson et al. 2015, Yang et al. 2019). The proposed defense converts a privately-trained teacher model into a DP-guaranteed student that formally bounds what an adversary can reconstruct about the original training data or labels.


Details

Domains
vision
Model Types
cnn, transformer, gan
Threat Tags
training_time, black_box
Applications
image classification, model deployment, private model conversion