
From Risk to Resilience: Towards Assessing and Mitigating the Risk of Data Reconstruction Attacks in Federated Learning

Xiangrui Xu 1,2, Zhize Li 3, Yufei Han 1, Bin Wang 4, Jiqiang Liu 1,5, Wei Wang 2

1 citation · 56 references · USENIX Security


Published on arXiv: 2512.15460

Model Inversion Attack

OWASP ML Top 10 — ML03

Key Finding

DRA risk in FL is governed by spectral properties of the Jacobian of shared model updates, providing a unified theoretical explanation for why existing defenses work and enabling adaptive noise-based defenses that protect privacy without accuracy loss.
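The spectral intuition behind this finding can be illustrated with a toy linear case (this is an illustration only, not the paper's InvLoss computation): for an embedding z = Wx, the Jacobian is W itself, and the size of its smallest singular values controls how accurately an attacker can recover x from z. All names and parameters below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def reconstruction_error(W, x, noise_std=0.01):
    """Least-squares reconstruction of x from a noisy embedding z = W @ x,
    mimicking an idealized data reconstruction attack on a linear layer."""
    z = W @ x + noise_std * rng.standard_normal(W.shape[0])
    x_hat, *_ = np.linalg.lstsq(W, z, rcond=None)
    return np.linalg.norm(x_hat - x)

d = 16
x = rng.standard_normal(d)

# Well-conditioned Jacobian: all singular values of comparable size.
W_good = rng.standard_normal((64, d))

# Same Jacobian with its trailing singular values shrunk toward zero.
U, s, Vt = np.linalg.svd(W_good, full_matrices=False)
W_bad = U @ np.diag(s * np.linspace(1.0, 1e-6, d)) @ Vt

# Small singular values amplify the attacker's error: the near-singular
# Jacobian makes reconstruction far less accurate under the same noise.
err_good = reconstruction_error(W_good, x)
err_bad = reconstruction_error(W_bad, x)
assert err_good < err_bad
```

The same logic motivates why gradient clipping, compression, and noise addition all help: each degrades the effective spectrum of the Jacobian the attacker must invert.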

InvLoss / InvRE

Novel technique introduced


Data Reconstruction Attacks (DRA) pose a significant threat to Federated Learning (FL) systems by enabling adversaries to infer sensitive training data from local clients. Despite extensive research, the question of how to characterize and assess the risk of DRAs in FL systems remains unresolved due to the lack of a theoretically-grounded risk quantification framework. In this work, we address this gap by introducing Invertibility Loss (InvLoss) to quantify the maximum achievable effectiveness of DRAs for a given data instance and FL model. We derive a tight and computable upper bound for InvLoss and explore its implications from three perspectives. First, we show that DRA risk is governed by the spectral properties of the Jacobian matrix of exchanged model updates or feature embeddings, providing a unified explanation for the effectiveness of defense methods. Second, we develop InvRE, an InvLoss-based DRA risk estimator that offers attack method-agnostic, comprehensive risk evaluation across data instances and model architectures. Third, we propose two adaptive noise perturbation defenses that enhance FL privacy without harming classification accuracy. Extensive experiments on real-world datasets validate our framework, demonstrating its potential for systematic DRA risk evaluation and mitigation in FL systems.


Key Contributions

  • InvLoss: a theoretically grounded metric quantifying the maximum achievable effectiveness of data reconstruction attacks for a given data instance and FL model, with a tight, computable upper bound derived from spectral properties of the Jacobian of exchanged gradients/embeddings
  • InvRE: an attack-method-agnostic DRA risk estimator enabling comprehensive evaluation across data instances and model architectures without requiring a specific attack implementation
  • Two adaptive noise perturbation defenses that improve FL privacy guarantees without degrading classification accuracy
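As a rough sketch of how an adaptive (rather than isotropic) noise defense might be shaped by the Jacobian spectrum, the toy below allocates a Gaussian noise budget across the left singular directions of a linear embedding's Jacobian. This is a hypothetical illustration of spectrum-aware noise shaping, not the paper's actual defense; `adaptive_noise` and its parameters are invented for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

def adaptive_noise(z, W, budget=0.5):
    """Hypothetical sketch: shape Gaussian noise by the left singular vectors
    of the Jacobian W, concentrating the budget in directions where inversion
    would otherwise be most accurate (large singular values)."""
    U, s, _ = np.linalg.svd(W, full_matrices=False)
    weights = s / np.linalg.norm(s)          # unit-norm per-direction scales
    noise = U @ (budget * weights * rng.standard_normal(len(s)))
    return z + noise

d = 8
W = rng.standard_normal((32, d))             # stand-in Jacobian of an embedding
x = rng.standard_normal(d)
z = W @ x                                    # shared embedding z = W @ x

z_def = adaptive_noise(z, W)

# Least-squares inversion recovers x almost exactly from the clean embedding,
# while the shaped noise degrades reconstruction.
err_clean = np.linalg.norm(np.linalg.lstsq(W, z, rcond=None)[0] - x)
err_def = np.linalg.norm(np.linalg.lstsq(W, z_def, rcond=None)[0] - x)
assert err_clean < err_def
```

The paper's defenses go further by adapting the perturbation so that classification accuracy is preserved; this sketch only shows the mechanism of tying noise allocation to the Jacobian's spectrum.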

🛡️ Threat Analysis

Model Inversion Attack

The paper's entire focus is on Data Reconstruction Attacks in FL, where an adversary reconstructs private client training data from exchanged model updates or feature embeddings. The paper introduces InvLoss to bound DRA effectiveness, develops InvRE as an attack-agnostic risk estimator, and proposes noise perturbation defenses evaluated against reconstruction attacks — a textbook ML03 scenario.


Details

Domains
federated-learning · vision
Model Types
federated · cnn
Threat Tags
white_box · training_time · digital
Applications
federated learning · image classification