No More Guessing: a Verifiable Gradient Inversion Attack in Federated Learning

Gradient inversion attacks threaten client privacy in federated learning by reconstructing training samples from clients' shared gradients. Gradients aggregate contributions from multiple records and existing attacks may fail to disentangle them, yielding incorrect reconstructions with no intrinsic way to certify success. In vision and language, attackers may fall back on human inspection to judge reconstruction plausibility, but this is far less feasible for numerical tabular records, fueling the impression that tabular data is less vulnerable. We challenge this perception by proposing a verifiable gradient inversion attack (VGIA) that provides an explicit certificate of correctness for reconstructed samples. Our method adopts a geometric view of ReLU leakage: the activation boundary of a fully connected layer defines a hyperplane in input space. VGIA introduces an algebraic, subspace-based verification test that detects when a hyperplane-delimited region contains exactly one record. Once isolation is certified, VGIA recovers the corresponding feature vector analytically and reconstructs the target via a lightweight optimization step. Experiments on tabular benchmarks with large batch sizes demonstrate exact record and target recovery in regimes where existing state-of-the-art attacks either fail or cannot assess reconstruction fidelity. Compared to prior geometric approaches, VGIA allocates hyperplane queries more effectively, yielding faster reconstructions with fewer attack rounds.
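The ReLU leakage that the abstract refers to can be illustrated with a small NumPy sketch. It is not the VGIA algorithm or its verification test. It only shows the widely known identity that such attacks build on: for a fully connected layer followed by ReLU, if exactly one record in the batch activates a neuron, that neuron's weight gradient divided by its bias gradient equals the record's feature vector. The squared-error loss, the single-neuron layer, and the names `layer_gradients`, `ratio_estimate`, and `bias_with_m_active` are assumptions made for illustration. The last helper reads the private data to place the hyperplane, which a real attacker could not do; it exists only to construct the two cases.

```python
import numpy as np

def layer_gradients(X, y, W, b, v):
    """Batch-summed gradients of sum_n 0.5*(v . relu(W x_n + b) - y_n)^2 w.r.t. W and b."""
    Z = X @ W.T + b                               # (n, k) pre-activations
    A = np.maximum(Z, 0.0)                        # ReLU
    err = A @ v - y                               # (n,) per-record residual
    delta = err[:, None] * v[None, :] * (Z > 0)   # (n, k) dLoss/dZ, zero where inactive
    return delta.T @ X, delta.sum(axis=0)         # grad_W (k, d), grad_b (k,)

def ratio_estimate(grad_W, grad_b, i, eps=1e-12):
    """grad_W[i] / grad_b[i]; equals a record only if exactly one record activates neuron i."""
    if abs(grad_b[i]) < eps:
        return None                               # no usable signal from this neuron
    return grad_W[i] / grad_b[i]

def bias_with_m_active(X, w, m):
    """Demo-only helper (hypothetical): bias that puts exactly m records on the
    positive side of the hyperplane w.x + b = 0. It uses the private data."""
    proj = np.sort(X @ w)
    return -(proj[-m] + proj[-m - 1]) / 2.0

rng = np.random.default_rng(0)
n, d = 8, 5
X = rng.normal(size=(n, d))       # private tabular batch (synthetic)
y = rng.normal(size=n)            # private targets (synthetic)
w = rng.normal(size=d)            # hyperplane normal, assumed chosen by the attacker
W = w[None, :]                    # one hidden neuron
v = np.array([1.0])               # output weight
order = np.argsort(X @ w)         # records sorted by projection onto w

# Case 1: the hyperplane isolates one record, so the ratio recovers it.
b1 = np.array([bias_with_m_active(X, w, 1)])
gW1, gb1 = layer_gradients(X, y, W, b1, v)
x_hat = ratio_estimate(gW1, gb1, 0)

# Case 2: two records are active, so the ratio is a weighted combination of both.
b2 = np.array([bias_with_m_active(X, w, 2)])
gW2, gb2 = layer_gradients(X, y, W, b2, v)
x_mix = ratio_estimate(gW2, gb2, 0)

print("isolated record recovered:", np.allclose(x_hat, X[order[-1]]))
print("two-record ratio matches a record:",
      np.allclose(x_mix, X[order[-1]]) or np.allclose(x_mix, X[order[-2]]))
```

In the second case the ratio is still a well-formed vector, and nothing in it reveals that it mixes two records. That ambiguity is the gap a certificate of isolation is meant to close.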

Key Contributions

Algebraic verification test that certifies when a hyperplane-delimited region contains exactly one training record
Analytic recovery of feature vectors from ReLU activation boundaries in federated learning gradients
Exact reconstruction of tabular training data in large-batch regimes where prior attacks fail

🛡️ Threat Analysis

Model Inversion Attack

The core contribution is the reconstruction of private training data (tabular records) from gradients shared in federated learning. The attack recovers exact feature vectors and targets from gradient observations, which makes it a model inversion (data reconstruction) attack. Its verification mechanism certifies when a reconstruction is correct.