Decoupling Generalizability and Membership Privacy Risks in Neural Networks
Published on arXiv (2602.02296)
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
PPTP maintains model generalizability significantly better than prior defenses while enhancing protection against membership inference attacks.
PPTP (Privacy-Preserving Training Principle)
Novel technique introduced
A deep learning model typically sacrifices some utility when it acquires other capabilities or characteristics, and privacy preservation exhibits exactly this trade-off with utility. The disparity in utility loss across different defense approaches suggests that generalizability and privacy risks can be decoupled to maximize privacy gain. In this paper, we identify that a model's generalization ability and its privacy risks reside in different regions of deep neural network architectures. Based on these observations, we propose the Privacy-Preserving Training Principle (PPTP), which protects vulnerable model components from privacy risks while minimizing the loss in generalizability. Extensive evaluations show that our approach maintains model generalizability significantly better than prior defenses while enhancing privacy preservation.
Key Contributions
- Identifies that generalization and membership privacy risks are localized in different architectural regions of deep neural networks
- Proposes Privacy-Preserving Training Principle (PPTP) that selectively protects high-risk model components to decouple privacy and utility
- Demonstrates significantly better utility-privacy tradeoff compared to existing membership inference defenses
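The core idea of selective protection can be illustrated independently of the paper's specifics. The paper does not publish its exact training objective here, so the sketch below is a hypothetical construction: a regularization penalty applied only to layers flagged as privacy-risky (e.g., a classifier head), leaving generalization-critical layers untouched. The layer names and the choice of an L2 penalty are assumptions for illustration, not PPTP's actual mechanism.

```python
import numpy as np

def privacy_penalty(params, high_risk, lam=1e-2):
    """Hypothetical selective regularizer: apply an L2 penalty only to
    layers flagged as privacy-risky, leaving all other layers untouched.
    `params` maps layer name -> weight array; `high_risk` is a set of
    layer names assumed (for illustration) to carry membership risk."""
    return lam * sum(np.sum(params[name] ** 2) for name in high_risk)

# Toy two-layer model: the feature extractor is left unregularized,
# while the classifier head receives the privacy penalty.
params = {
    "features.0": np.ones((4, 4)),       # generalization-critical: skipped
    "classifier": np.full((4, 2), 2.0),  # privacy-risky: penalized
}
penalty = privacy_penalty(params, high_risk={"classifier"})
```

In a full training loop this penalty would simply be added to the task loss, so gradient pressure toward small (less memorizing) weights lands only on the selected components.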
🛡️ Threat Analysis
The paper explicitly targets 'membership privacy risks': the threat that an adversary can determine whether a specific sample was in the training set. PPTP is a training-time defense against membership inference attacks that decouples this risk from generalizability.
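For context, the simplest membership inference attack (not the paper's method, but the standard baseline it defends against) thresholds per-sample loss: training members tend to have lower loss than held-out non-members. A minimal sketch with synthetic loss values:

```python
import numpy as np

def loss_threshold_attack(losses, threshold):
    """Predict membership: samples whose loss falls below the threshold
    are guessed to be training-set members (loss-threshold baseline)."""
    return losses < threshold

# Synthetic per-sample losses: an overfit model fits its training
# members more tightly than unseen non-members.
rng = np.random.default_rng(0)
member_losses = rng.exponential(scale=0.2, size=1000)
nonmember_losses = rng.exponential(scale=1.0, size=1000)

threshold = 0.5  # e.g., set to the model's average training loss
tpr = loss_threshold_attack(member_losses, threshold).mean()
fpr = loss_threshold_attack(nonmember_losses, threshold).mean()
advantage = tpr - fpr  # membership advantage; 0 means no leakage
print(f"TPR={tpr:.2f} FPR={fpr:.2f} advantage={advantage:.2f}")
```

A defense like PPTP aims to drive this advantage toward zero while keeping test accuracy (generalizability) intact, which is precisely the trade-off the paper claims to improve.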