Evaluating Differential Privacy Against Membership Inference in Federated Learning: Insights from the NIST Genomics Red Team Challenge

While Federated Learning (FL) mitigates direct data exposure, the resulting trained models remain susceptible to membership inference attacks (MIAs). This paper presents an empirical evaluation of Differential Privacy (DP) as a defense mechanism against MIAs in FL, leveraging the environment of the 2025 NIST Genomics Privacy-Preserving Federated Learning (PPFL) Red Teaming Event. To improve inference accuracy, we propose a stacking attack strategy that ensembles seven black-box estimators to train a meta-classifier on prediction probabilities and cross-entropy losses. We evaluate this methodology against target models under three privacy configurations: an unprotected convolutional neural network (CNN, $ε=\infty$), a low-privacy DP model ($ε=200$), and a high-privacy DP model ($ε=10$). The attack outperforms all baselines in the No DP and Low Privacy settings and, critically, maintains measurable membership leakage at $ε=200$ where a single-signal LiRA baseline collapses. Evaluated on an independent third-party benchmark, these results provide an empirical characterisation of how stacking-based inference degrades across calibrated DP tiers in FL.

Key Contributions

Stacking-based MIA that ensembles seven black-box estimators to train a meta-classifier on prediction probabilities and cross-entropy losses
Empirical evaluation showing residual membership leakage persists at ε=200 even when single-signal baselines collapse
First-place performance in the NIST Genomics PPFL Red Teaming Event for the No DP and Low Privacy tiers

Threat Analysis

Membership Inference Attack

The primary contribution is a membership inference attack that determines whether specific records were used to train FL models, evaluated empirically against differential privacy defenses across three privacy tiers.
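
The stacking idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not name the seven black-box estimators, so the five signals below (true-label probability, cross-entropy loss, top-class confidence, margin, and entropy) are illustrative stand-ins, the logistic-regression meta-classifier is an assumed choice, and the toy data simply assumes that members receive more confident predictions than non-members. In practice the membership labels used for training would come from data the attacker controls, for example shadow models.

```python
import math
import random


def base_signals(probs, label):
    """Black-box signals from one prediction vector (illustrative, not the paper's seven)."""
    eps = 1e-12  # guards against log(0)
    ranked = sorted(probs, reverse=True)
    return [
        probs[label],                                    # probability of the true label
        -math.log(max(probs[label], eps)),               # cross-entropy loss
        ranked[0],                                       # top-class confidence
        ranked[0] - ranked[1],                           # margin between top two classes
        -sum(p * math.log(max(p, eps)) for p in probs),  # prediction entropy
    ]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, x):
    """Meta-classifier output: estimated probability that the record is a member."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_meta_classifier(features, is_member, lr=0.5, epochs=300):
    """Fit a logistic-regression meta-classifier with batch gradient descent."""
    n, d = len(features), len(features[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, is_member):
            err = predict(weights, bias, x) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def toy_prediction(rng, member):
    """Hypothetical target-model output: members get more confident predictions."""
    p_true = rng.uniform(0.85, 0.99) if member else rng.uniform(0.40, 0.80)
    return [p_true, 1.0 - p_true], 0  # (probability vector, true label)


rng = random.Random(0)
membership = [i % 2 for i in range(200)]  # alternating non-member / member labels
features = [base_signals(*toy_prediction(rng, bool(m))) for m in membership]
weights, bias = train_meta_classifier(features, membership)

# Score two unseen records: one predicted confidently, one not.
score_confident = predict(weights, bias, base_signals([0.97, 0.03], 0))
score_uncertain = predict(weights, bias, base_signals([0.55, 0.45], 0))
print(score_confident, score_uncertain)
```

The point of stacking several signals is that the meta-classifier can still rank records when any single signal becomes uninformative, which matches the motivation the abstract gives for comparing against a single-signal baseline.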