Defense · 2025

On the Detectability of Active Gradient Inversion Attacks in Federated Learning

Vincenzo Carletti, Pasquale Foggia, Carlo Mazzocca, Giuseppe Parrella, Mario Vento

Published on arXiv · arXiv:2511.10502 · 0 citations

Model Inversion Attack

OWASP ML Top 10 — ML03

Key Finding

Proposed client-side detection techniques effectively identify active GIAs across diverse FL configurations without any modifications to the standard training protocol, exposing the limited real-world stealthiness of recent attacks.


One of the key advantages of Federated Learning (FL) is its ability to collaboratively train a Machine Learning (ML) model while keeping clients' data on-site. However, this can create a false sense of security. Although keeping private data local increases overall privacy, prior studies have shown that the gradients exchanged during FL training remain vulnerable to Gradient Inversion Attacks (GIAs). These attacks reconstruct clients' local data, breaking the privacy promise of FL. GIAs can be launched by either a passive or an active server. In the latter case, a malicious server manipulates the global model to facilitate data reconstruction. While effective, earlier attacks in this category have been shown to be detectable by clients, limiting their real-world applicability. Recently, novel active GIAs have emerged, claiming to be far stealthier than previous approaches. This work provides the first comprehensive analysis of these claims, investigating four state-of-the-art GIAs. We propose novel lightweight client-side detection techniques based on statistically improbable weight structures and anomalous loss and gradient dynamics. Extensive evaluation across several configurations demonstrates that our methods enable clients to effectively detect active GIAs without any modifications to the FL training protocol.
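To make the underlying threat concrete, here is a minimal toy sketch (not taken from the paper) of why shared gradients leak inputs: for a fully connected layer y = Wx + b with scalar loss L, the gradients satisfy dL/dW = (dL/dy)·xᵀ and dL/db = dL/dy, so a server can recover the input analytically by dividing a row of the weight gradient by the corresponding bias gradient.

```python
import numpy as np

# Toy illustration of gradient leakage from a fully connected layer
# (our construction, not the paper's). Since dL/dW = (dL/dy) x^T and
# dL/db = dL/dy, each input coordinate is x_j = grad_W[i, j] / grad_b[i]
# for any row i whose bias gradient is nonzero.
rng = np.random.default_rng(0)
x = rng.normal(size=4)            # private client input
dy = rng.normal(size=3)           # upstream gradient dL/dy (arbitrary here)

grad_W = np.outer(dy, x)          # what the server observes: dL/dW
grad_b = dy                       # and dL/db

i = int(np.argmax(np.abs(grad_b)))  # pick a row with a large bias gradient
x_rec = grad_W[i] / grad_b[i]       # analytic reconstruction of the input

assert np.allclose(x_rec, x)
```

This analytic shortcut only works when per-sample gradients reach the server; active GIAs manipulate the model precisely to recreate such favorable conditions under aggregation.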


Key Contributions

  • First comprehensive analysis of stealthiness claims of four state-of-the-art active Gradient Inversion Attacks in federated learning
  • Novel lightweight client-side detection methods based on statistically improbable weight structures and anomalous loss/gradient dynamics
  • Demonstrated detection effectiveness across multiple FL configurations without requiring modifications to the standard FL protocol

🛡️ Threat Analysis

Model Inversion Attack

Gradient Inversion Attacks (GIAs) are the canonical ML03 threat: an adversary (here, a malicious FL server) reconstructs clients' private training data from shared gradients. The paper analyzes four state-of-the-art active GIAs and proposes defenses (detection techniques) directly against this data reconstruction threat in federated learning.
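The "statistically improbable weight structures" idea can be illustrated with a toy check (our construction, with illustrative thresholds; the paper's concrete tests may differ): some active GIAs implant near-duplicate rows in a fully connected layer to funnel individual inputs through it, and a random benign initialization makes such row collisions vanishingly unlikely.

```python
import numpy as np

# Illustrative client-side structural check (not the paper's exact test):
# flag a received weight matrix if an unusually large fraction of its
# row pairs are near-parallel, which is statistically improbable for
# benignly trained or randomly initialized layers.
def improbable_row_structure(W, cos_thresh=0.999, max_dup_frac=0.05):
    """Flag W if too many row pairs have |cosine similarity| > cos_thresh."""
    W = np.asarray(W, dtype=float)
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    U = W / np.clip(norms, 1e-12, None)      # unit-normalize rows
    cos = U @ U.T                            # pairwise cosine similarities
    iu = np.triu_indices(W.shape[0], k=1)    # upper triangle: distinct pairs
    dup_frac = np.mean(np.abs(cos[iu]) > cos_thresh)
    return bool(dup_frac > max_dup_frac)

rng = np.random.default_rng(1)
benign = rng.normal(size=(64, 128))                      # rows ~ orthogonal
malicious = np.tile(rng.normal(size=(1, 128)), (64, 1))  # implanted duplicates
print(improbable_row_structure(benign))      # False
print(improbable_row_structure(malicious))   # True
```

Because the check only reads the weights the client already receives each round, it fits the paper's constraint of requiring no modification to the standard FL training protocol.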


Details

Domains
federated-learning
Model Types
federated
Threat Tags
white_box, training_time
Applications
federated learning systems