Evaluating Differential Privacy Against Membership Inference in Federated Learning: Insights from the NIST Genomics Red Team Challenge
Published on arXiv: 2604.12737
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
The attack ranks first in the No DP (ε=∞) and Low Privacy (ε=200) tiers and maintains measurable membership leakage at ε=200, where a single-signal LiRA baseline fails.
Novel technique introduced: Stacking-based MIA
While Federated Learning (FL) mitigates direct data exposure, the resulting trained models remain susceptible to membership inference attacks (MIAs). This paper presents an empirical evaluation of Differential Privacy (DP) as a defense mechanism against MIAs in FL, leveraging the environment of the 2025 NIST Genomics Privacy-Preserving Federated Learning (PPFL) Red Teaming Event. To improve inference accuracy, we propose a stacking attack strategy that ensembles seven black-box estimators to train a meta-classifier on prediction probabilities and cross-entropy losses. We evaluate this methodology against target models under three privacy configurations: an unprotected convolutional neural network (CNN, $\varepsilon=\infty$), a low-privacy DP model ($\varepsilon=200$), and a high-privacy DP model ($\varepsilon=10$). The attack outperforms all baselines in the No DP and Low Privacy settings and, critically, maintains measurable membership leakage at $\varepsilon=200$, where a single-signal LiRA baseline collapses. Obtained on an independent third-party benchmark, these results provide an empirical characterisation of how stacking-based inference degrades across calibrated DP tiers in FL.
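For readers less familiar with the privacy parameter, the standard definition behind the three tiers can be written out as below. This is the textbook $(\varepsilon,\delta)$-DP guarantee, not a statement taken from the paper: the symbols $M$, $D$, $D'$, $S$, and $\delta$ are introduced here for illustration, and the summary does not say which $\delta$ or which unit of privacy (record or client) the challenge used.

$$
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S] + \delta
\qquad \text{for all neighbouring datasets } D, D' \text{ and all output sets } S .
$$

Read as a hypothesis test between "record present" and "record absent", the same guarantee caps any membership attack's true-positive rate in terms of its false-positive rate:

$$
\mathrm{TPR} \;\le\; e^{\varepsilon}\,\mathrm{FPR} + \delta .
$$

With $\varepsilon = 200$ the factor $e^{\varepsilon} \approx 7.2 \times 10^{86}$ makes this bound vacuous at any practical false-positive rate, so the formal guarantee alone does not rule out leakage at that tier. With $\varepsilon = 10$ the factor is $e^{10} \approx 2.2 \times 10^{4}$, and with $\varepsilon = \infty$ there is no bound at all.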
Key Contributions
- Stacking-based MIA that ensembles seven black-box estimators to train a meta-classifier on prediction probabilities and cross-entropy losses
- Empirical evaluation showing residual membership leakage persists at ε=200 even when single-signal baselines collapse
- First-place performance in the NIST Genomics PPFL Red Teaming Event for the No DP and Low Privacy tiers
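The stacking idea in the first contribution can be illustrated with a minimal sketch. It is not the authors' implementation. The summary does not name the seven estimators, so the seven scikit-learn classifiers below are placeholders, and the overfit random forest on synthetic data is a hypothetical stand-in for the challenge's target models. The sketch also trains the attack directly on labelled target outputs, whereas a real attacker would usually obtain such labels from shadow models or challenge-provided records.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

SEED = 0


def attack_features(model, X, y):
    """Black-box signals: class probabilities plus per-record cross-entropy loss."""
    probs = model.predict_proba(X)
    true_class_prob = probs[np.arange(len(y)), y]
    loss = -np.log(np.clip(true_class_prob, 1e-12, 1.0))
    return np.column_stack([probs, loss])


# Hypothetical target: an overfit model, so members and non-members look different.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=10, n_classes=3, random_state=SEED
)
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5, random_state=SEED)
target = RandomForestClassifier(n_estimators=50, random_state=SEED).fit(X_in, y_in)

# Attack dataset: features from target outputs, label 1 = member, 0 = non-member.
F = np.vstack([attack_features(target, X_in, y_in),
               attack_features(target, X_out, y_out)])
m = np.concatenate([np.ones(len(X_in), dtype=int), np.zeros(len(X_out), dtype=int)])
F_train, F_test, m_train, m_test = train_test_split(
    F, m, test_size=0.3, stratify=m, random_state=SEED
)

# Seven placeholder base estimators (assumption: the paper's choices are not given).
base_estimators = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier()),
    ("nb", GaussianNB()),
    ("tree", DecisionTreeClassifier(max_depth=5, random_state=SEED)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=SEED)),
    ("et", ExtraTreesClassifier(n_estimators=50, random_state=SEED)),
    ("gb", GradientBoostingClassifier(random_state=SEED)),
]

# Meta-classifier trained on the base estimators' cross-validated probabilities.
stacked_attack = StackingClassifier(
    estimators=base_estimators,
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",
    cv=5,
)
stacked_attack.fit(F_train, m_train)

membership_scores = stacked_attack.predict_proba(F_test)[:, 1]
auc = roc_auc_score(m_test, membership_scores)
print(f"stacked attack AUC on toy data: {auc:.3f}")
```

The toy target has no DP noise, so this corresponds loosely to the No DP tier. Swapping in a target trained with a DP optimizer would show how the same attack features lose signal as ε decreases.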
🛡️ Threat Analysis
The primary contribution is a membership inference attack that determines whether specific records were used to train FL models, evaluated empirically against differential privacy defenses across three privacy tiers.