No More Guessing: a Verifiable Gradient Inversion Attack in Federated Learning
Francesco Diana 1,2, Chuan Xu 1,2,3,4, André Nusser 1,2,3,4, Giovanni Neglia 1,2
Published on arXiv
2604.15063
Model Inversion Attack
OWASP ML Top 10 — ML03
Key Finding
Achieves exact record and target recovery in large-batch federated learning settings where existing state-of-the-art gradient inversion attacks fail or cannot assess reconstruction fidelity
Novel technique introduced
VGIA
Gradient inversion attacks threaten client privacy in federated learning by reconstructing training samples from clients' shared gradients. Gradients aggregate contributions from multiple records and existing attacks may fail to disentangle them, yielding incorrect reconstructions with no intrinsic way to certify success. In vision and language, attackers may fall back on human inspection to judge reconstruction plausibility, but this is far less feasible for numerical tabular records, fueling the impression that tabular data is less vulnerable. We challenge this perception by proposing a verifiable gradient inversion attack (VGIA) that provides an explicit certificate of correctness for reconstructed samples. Our method adopts a geometric view of ReLU leakage: the activation boundary of a fully connected layer defines a hyperplane in input space. VGIA introduces an algebraic, subspace-based verification test that detects when a hyperplane-delimited region contains exactly one record. Once isolation is certified, VGIA recovers the corresponding feature vector analytically and reconstructs the target via a lightweight optimization step. Experiments on tabular benchmarks with large batch sizes demonstrate exact record and target recovery in regimes where existing state-of-the-art attacks either fail or cannot assess reconstruction fidelity. Compared to prior geometric approaches, VGIA allocates hyperplane queries more effectively, yielding faster reconstructions with fewer attack rounds.
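The analytic recovery step mentioned in the abstract rests on a well-known property of fully connected ReLU layers: if exactly one record in the batch activates a neuron, the ratio of that neuron's weight gradient to its bias gradient equals the record's feature vector. The sketch below illustrates this on a toy single-neuron setup. It is a minimal illustration and not the paper's implementation; the helper `neuron_gradients`, the random `upstream` values standing in for the loss gradients that reach the neuron, and the way the bias is chosen are all assumptions made for the example.

```python
import numpy as np

def neuron_gradients(X, w, b, upstream):
    """Gradients of a single ReLU neuron, summed over a batch.

    X: (B, d) batch of records; w: (d,) weights; b: scalar bias.
    upstream: (B,) stand-in for dLoss/d(activation) of each record (assumed).
    """
    active = (X @ w + b > 0).astype(float)   # 1 where the ReLU is on
    delta = upstream * active                # per-record gradient at the pre-activation
    grad_w = delta @ X                       # sum_k delta_k * x_k
    grad_b = delta.sum()                     # sum_k delta_k
    return grad_w, grad_b

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))                  # hypothetical batch of 8 tabular records
upstream = rng.normal(size=8)
w = rng.normal(size=5)

proj = X @ w
order = np.sort(proj)

# Hyperplane placed between the two largest projections: one record is isolated.
b_one = -(order[-1] + order[-2]) / 2
grad_w, grad_b = neuron_gradients(X, w, b_one, upstream)
recovered = grad_w / grad_b
target = X[np.argmax(proj)]

# Hyperplane placed one step lower: two records are active and the ratio mixes them.
b_two = -(order[-2] + order[-3]) / 2
grad_w2, grad_b2 = neuron_gradients(X, w, b_two, upstream)
mixed = grad_w2 / grad_b2

print("isolated record recovered:", np.allclose(recovered, target))
print("mixed ratio matches a record:", any(np.allclose(mixed, x) for x in X))
```

In the second case the ratio is a weighted combination of two records, and nothing in the ratio itself signals the failure. That gap is what a certificate of correctness is meant to close.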
Key Contributions
- Algebraic verification test that certifies when a hyperplane-delimited region contains exactly one training record
- Analytic recovery of feature vectors from ReLU activation boundaries in federated learning gradients
- Exact reconstruction of tabular training data in large-batch regimes where prior attacks fail
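To give a feel for what an algebraic isolation check can look like, the sketch below uses a simple rank argument. It is an illustrative stand-in and not the verification test from the paper. The assumption is that several neurons share the same activation hyperplane, so they are active on the same set of records. Each neuron's gradient row `[grad_w, grad_b]` is then a linear combination of the vectors `[x_k, 1]` of the active records, and for generic data the stacked matrix has rank 1 only when a single record is active. The helpers `stacked_gradients` and `numerical_rank`, and the random matrix `U` standing in for per-neuron upstream gradients, are hypothetical names introduced for this example.

```python
import numpy as np

def stacked_gradients(X, w, b, U):
    """Stack [grad_w | grad_b] rows for m neurons sharing one hyperplane.

    X: (B, d) records; w: (d,) shared weights; b: shared scalar bias.
    U: (B, m) stand-in upstream gradients, one column per neuron (assumed).
    """
    active = (X @ w + b > 0).astype(float)   # same active set for every neuron
    delta = U * active[:, None]              # (B, m)
    G_w = delta.T @ X                        # (m, d) weight-gradient rows
    g_b = delta.sum(axis=0)                  # (m,) bias gradients
    return np.hstack([G_w, g_b[:, None]])    # (m, d + 1)

def numerical_rank(M, rel_tol=1e-9):
    """Count singular values above a tolerance relative to the largest one."""
    s = np.linalg.svd(M, compute_uv=False)
    if s[0] == 0:
        return 0
    return int((s > rel_tol * s[0]).sum())

rng = np.random.default_rng(1)
X = rng.normal(size=(16, 6))                 # hypothetical batch of 16 records
U = rng.normal(size=(16, 4))                 # 4 neurons on the same hyperplane
w = rng.normal(size=6)
order = np.sort(X @ w)

b_one = -(order[-1] + order[-2]) / 2         # region containing one record
b_three = -(order[-3] + order[-4]) / 2       # region containing three records

M_one = stacked_gradients(X, w, b_one, U)
M_three = stacked_gradients(X, w, b_three, U)
rank_one = numerical_rank(M_one)
rank_three = numerical_rank(M_three)

# When the rank is 1, any row yields the isolated record as grad_w / grad_b.
recovered = M_one[0, :-1] / M_one[0, -1]
print(rank_one, rank_three, np.allclose(recovered, X[np.argmax(X @ w)]))
```

The check needs at least two neurons on the same hyperplane, and it relies on the upstream gradients differing across neurons; both are assumptions of this toy setup, not claims about how VGIA allocates its hyperplane queries.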
🛡️ Threat Analysis
The core contribution is the reconstruction of private training data (tabular records) from the gradients shared in federated learning. The attack recovers exact feature vectors and targets from gradient observations, which makes it a model inversion (data reconstruction) attack. Its verification mechanism certifies when a reconstruction has succeeded.