Differential Privacy: Gradient Leakage Attacks in Federated Learning Environments
Miguel Fernandez-de-Retana 1,2, Unai Zulaika 2, Rubén Sánchez-Corcuera 2, Aitor Almeida 2
Published on arXiv (2510.23931)
Model Inversion Attack
OWASP ML Top 10 — ML03
Key Finding
DP-SGD significantly mitigates gradient leakage-based training data reconstruction at a moderate utility cost, while PDP-SGD maintains accuracy but fails to prevent private data reconstruction in federated learning.
Federated Learning (FL) allows Machine Learning models to be trained collaboratively without sharing sensitive data. However, it remains vulnerable to Gradient Leakage Attacks (GLAs), which can reveal private information from the shared model updates. In this work, we investigate the effectiveness of Differential Privacy (DP) mechanisms, specifically DP-SGD and a variant based on explicit regularization (PDP-SGD), as defenses against GLAs. To this end, we evaluate the performance of several computer vision models trained under varying privacy levels on a simple classification task, and then analyze the quality of private data reconstructions obtained from the intercepted gradients in a simulated FL environment. Our results demonstrate that DP-SGD significantly mitigates the risk of gradient leakage attacks, albeit with a moderate trade-off in model utility. In contrast, PDP-SGD maintains strong classification performance but proves ineffective as a practical defense against reconstruction attacks. These findings highlight the importance of empirically evaluating privacy mechanisms beyond their theoretical guarantees, particularly in distributed learning scenarios where information leakage can pose a critical threat to data security and privacy.
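The DP-SGD mechanism evaluated in the paper follows the standard recipe: clip each per-sample gradient to a fixed norm, aggregate, and add calibrated Gaussian noise before the update leaves the client. A minimal NumPy sketch of one aggregation step (the function name, clip norm, and noise multiplier below are illustrative, not the paper's configuration):

```python
import numpy as np

def dp_sgd_step(per_sample_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One DP-SGD aggregation step (illustrative sketch).

    Clips each per-sample gradient to L2 norm <= clip_norm, sums the
    clipped gradients, adds Gaussian noise with std noise_multiplier *
    clip_norm, and returns the noisy mean.
    """
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_sample_grads:
        norm = np.linalg.norm(g)
        # Scale down only gradients whose norm exceeds the clip threshold.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_sample_grads)

# Example: one gradient gets clipped (norm 5 -> 1), the other passes through.
grads = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]
noisy_update = dp_sgd_step(grads)
```

The clipping bounds any single example's influence on the update, which is what lets Gaussian noise of a fixed scale yield a formal privacy guarantee; it is also the reason for the utility cost the paper measures.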
Key Contributions
- Empirical evaluation of DP-SGD and PDP-SGD as defenses against gradient leakage attacks in a simulated federated learning environment
- Demonstrates DP-SGD significantly mitigates private data reconstruction risk at a moderate classification accuracy cost
- Shows PDP-SGD preserves model utility but is ineffective as a practical defense against gradient inversion reconstruction
🛡️ Threat Analysis
The core threat is gradient leakage: an adversary intercepts FL gradient updates and reconstructs participants' private training data. The paper simulates this reconstruction attack and evaluates DP-SGD and PDP-SGD specifically as defenses against it, providing a concrete data reconstruction threat model that satisfies the 'adversary test' for OWASP ML03 (Model Inversion Attack).
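To see concretely why raw gradients leak training data, consider the simplest analytic case (an illustration of the leakage principle, not the paper's iterative gradient-matching attack): for a linear classification layer with cross-entropy loss, a single example's input can be recovered exactly from the intercepted weight and bias gradients, since the weight gradient is the outer product of the bias gradient and the input.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy linear classifier: logits = W @ x + b, single-example cross-entropy.
# Then grad_W = (p - y_onehot) outer x and grad_b = (p - y_onehot),
# so any row i with grad_b[i] != 0 reveals x as grad_W[i] / grad_b[i].
rng = np.random.default_rng(42)
x = rng.normal(size=8)                 # "private" training input
y = 2                                  # true class label
W = rng.normal(size=(4, 8))
b = rng.normal(size=4)

p = softmax(W @ x + b)
err = p.copy()
err[y] -= 1.0                          # dLoss/dlogits for cross-entropy
grad_W = np.outer(err, x)              # gradients an FL server would see
grad_b = err

i = int(np.argmax(np.abs(grad_b)))     # pick a row with nonzero bias gradient
x_reconstructed = grad_W[i] / grad_b[i]
```

For deep networks the inversion is no longer closed-form, so attacks like the one simulated in the paper instead optimize a dummy input until its gradients match the intercepted ones; DP-SGD's clipping and noise degrade exactly the gradient signal this matching relies on.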