Fed-Listing: Federated Label Distribution Inference in Graph Neural Networks
Suprim Nakarmi 1, Junggab Son 1, Yue Zhao 2, Zuobin Xiong 1
Published on arXiv
2602.00407
Model Inversion Attack
OWASP ML Top 10 — ML03
Key Finding
Fed-Listing significantly outperforms baselines including random guessing and Decaf across four benchmark datasets in non-i.i.d. federated settings, and standard defenses cannot reduce attack effectiveness without catastrophic model utility degradation.
Fed-Listing
Novel technique introduced
Graph Neural Networks (GNNs) have been extensively studied for their expressive representations and strong learning performance on graph-structured data, enabling effective modeling of complex relational dependencies among nodes and edges across various domains. However, standalone GNNs expose threat surfaces and raise privacy concerns, since sensitive graph-structured data must be collected and processed centrally. To address this, Federated Graph Neural Networks (FedGNNs) have been proposed to enable collaborative learning over decentralized local graph data while preserving user privacy. Yet emerging research indicates that even in these settings, shared model updates, particularly gradients, can unintentionally leak sensitive information about local users. Numerous privacy inference attacks have been explored in traditional federated learning and extended to graph settings, but label distribution inference in FedGNNs remains largely underexplored. In this work, we introduce Fed-Listing (Federated Label Distribution Inference in GNNs), a novel gradient-based attack designed to infer the private label statistics of target clients in FedGNNs without access to raw data or node features. Fed-Listing leverages only the final-layer gradients exchanged during training to uncover statistical patterns that reveal class proportions in a stealthy manner. An auxiliary shadow dataset is used to generate diverse label partitioning strategies, simulating various client distributions, on which the attack model is trained. Extensive experiments on four benchmark datasets and three GNN architectures show that Fed-Listing significantly outperforms existing baselines, including random guessing and Decaf, even under challenging non-i.i.d. scenarios. Moreover, defense mechanisms barely reduce the attack's performance unless the model's utility is severely degraded.
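To see why final-layer gradients can reveal class proportions at all, consider a softmax classifier trained with cross-entropy: the gradient with respect to the final-layer bias for class c equals the mean of (predicted probability of c minus the one-hot indicator of c) over the batch, so it directly encodes the label mix. The sketch below illustrates this principle only; it is not the paper's pipeline (Fed-Listing trains an attack model on shadow gradients), and a real adversary would have to approximate the mean predicted probabilities rather than recompute them as done here.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, C = 200, 16, 4                      # samples, feature dim, classes

# Synthetic node embeddings and a skewed label distribution (hypothetical client)
h = rng.normal(size=(N, D))
labels = rng.choice(C, size=N, p=[0.55, 0.25, 0.15, 0.05])

# Final linear layer (weights W, bias b) followed by softmax
W = rng.normal(scale=0.1, size=(D, C))
b = np.zeros(C)
logits = h @ W + b
p = np.exp(logits - logits.max(axis=1, keepdims=True))
p /= p.sum(axis=1, keepdims=True)

# Cross-entropy gradient w.r.t. the bias: dL/db_c = mean_i (p_ic - [y_i == c])
onehot = np.eye(C)[labels]
grad_b = (p - onehot).mean(axis=0)

# The bias gradient therefore encodes class proportions:
#   prop_c = mean_i p_ic - grad_b_c
# Here we reuse the true mean predictions for demonstration; an attacker only
# observes grad_b and must estimate the mean predictions (e.g., via shadow data).
est_prop = p.mean(axis=0) - grad_b
true_prop = onehot.mean(axis=0)
print(np.round(est_prop, 3), np.round(true_prop, 3))
```

With the mean predictions in hand the recovery is exact by construction; the difficulty of the attack lies in estimating that term from the adversary's side, which is what learning-based approaches address.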
Key Contributions
- Novel gradient-based attack (Fed-Listing) that infers client label distribution statistics from final-layer gradients in FedGNNs without accessing raw data or node features
- Shadow dataset strategy to simulate diverse label partitioning strategies for training the attack model
- Empirical demonstration that existing defenses (e.g., gradient perturbation) fail to mitigate the attack without severely degrading model utility across four datasets and three GNN architectures
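The shadow-dataset step above relies on simulating many plausible client label distributions. A common way to produce such non-i.i.d. label partitions (the paper does not specify its exact scheme, so this is an assumption) is Dirichlet-based splitting, where a smaller concentration parameter yields more skewed clients. The helper below, `dirichlet_label_partition`, is a hypothetical name for this sketch.

```python
import numpy as np

def dirichlet_label_partition(labels, n_clients, alpha, rng):
    """Split sample indices into clients with Dirichlet-skewed label mixes.

    Smaller alpha -> more non-i.i.d. (each client dominated by few classes).
    """
    labels = np.asarray(labels)
    client_idx = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        # Proportion of class-c samples assigned to each client
        props = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_idx[client].extend(part.tolist())
    return [np.array(ix, dtype=int) for ix in client_idx]

rng = np.random.default_rng(0)
labels = rng.choice(5, size=1000)
parts = dirichlet_label_partition(labels, n_clients=8, alpha=0.3, rng=rng)
```

Each simulated partition yields a ground-truth label histogram per shadow client, giving labeled (gradient, distribution) pairs on which an attack model can be trained.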
🛡️ Threat Analysis
Fed-Listing is a gradient leakage attack in federated learning: the adversary (the server) observes gradients shared during FedGNN training and extracts private statistical attributes, namely per-class label proportions, of clients' local datasets. This is a property inference attack that falls under ML03's coverage of recovering private attributes from model or gradient information in federated settings.
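The defenses evaluated against such attacks are typically DP-style gradient perturbation: clip each client's gradient and add Gaussian noise before sharing. A minimal sketch of that mechanism is below (the function name and parameters are illustrative, not from the paper); the reported finding is that noise strong enough to hide label statistics also degrades the model update, hence the utility/privacy trade-off.

```python
import numpy as np

def perturb_gradient(grad, clip_norm, noise_std, rng):
    """Clip a gradient to L2 norm clip_norm, then add Gaussian noise.

    Larger noise_std obscures the label statistics encoded in the gradient,
    but also corrupts the update the server aggregates, hurting utility.
    """
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip_norm / max(norm, 1e-12))
    return clipped + rng.normal(scale=noise_std, size=grad.shape)

rng = np.random.default_rng(0)
grad = np.full(10, 5.0)                       # toy shared gradient
shared = perturb_gradient(grad, clip_norm=1.0, noise_std=0.1, rng=rng)
```

Per the paper's evaluation, moderate settings of such perturbation barely dent the attack; only severe noise does, at the cost of model accuracy.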