
How does Graph Structure Modulate Membership-Inference Risk for Graph Neural Networks?

Megha Khosla

0 citations · 36 references · arXiv

Published on arXiv · 2601.17130

Membership Inference Attack

OWASP ML Top 10 — ML04

Key Finding

Inference-time access to cross-split edges is the dominant factor in GNN membership-inference risk: MI advantage can rise or fall independently of the generalization gap, invalidating gap-as-proxy assumptions.


Graph neural networks (GNNs) have become the standard tool for encoding data and their complex relationships into continuous representations, improving prediction accuracy in machine learning tasks such as node classification and link prediction. However, their use in sensitive applications has raised concerns about potential leakage of training data. Research on privacy leakage in GNNs has largely been shaped by findings from non-graph domains such as images and tabular data. We emphasize the need for graph-specific analysis and investigate the impact of graph structure on node-level membership inference (MI). We formalize MI over node-neighbourhood tuples and investigate two important dimensions: (i) training graph construction and (ii) inference-time edge access. Empirically, snowball sampling's coverage bias often harms generalization relative to random sampling, while enabling edges between train and test nodes at inference improves test accuracy, shrinks the train-test gap, and yields the lowest membership advantage across most models and datasets. We further show that the generalization gap, empirically measured as the performance difference between train and test nodes, is an incomplete proxy for MI risk: access to edges dominates, and MI can rise or fall independently of gap changes. Finally, we examine the auditability of differentially private GNNs, adapting the definition of statistical exchangeability of train-test data points for graph-based models. We show that for node-level tasks, inductive splits (random or snowball sampled) break exchangeability, limiting the applicability of standard membership-advantage bounds for differentially private models.
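The membership advantage discussed above is commonly measured as the gap between an attack's true-positive and false-positive rates. A minimal sketch of a loss/confidence-threshold attack, using synthetic per-node scores (the score distributions and threshold sweep here are illustrative assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-node confidence scores for the true class:
# members (train nodes) tend to score higher than non-members (test nodes).
member_scores = rng.beta(8, 2, size=500)      # stand-in for train-node confidences
nonmember_scores = rng.beta(5, 3, size=500)   # stand-in for test-node confidences

def membership_advantage(members, nonmembers, threshold):
    """Advantage of a threshold attack that predicts 'member' when
    score >= threshold: true-positive rate minus false-positive rate."""
    tpr = np.mean(members >= threshold)
    fpr = np.mean(nonmembers >= threshold)
    return tpr - fpr

# Sweep thresholds and report the best achievable advantage.
thresholds = np.linspace(0.0, 1.0, 101)
best = max(membership_advantage(member_scores, nonmember_scores, t)
           for t in thresholds)
print(f"best membership advantage: {best:.3f}")
```

The larger the separation between the two score distributions, the higher the advantage; the paper's point is that inference-time edge access can move this quantity even when the train-test accuracy gap stays flat.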


Key Contributions

  • Formalizes membership inference over node-neighbourhood tuples for GNNs, distinguishing graph-specific MI structure from non-graph settings.
  • Empirically demonstrates that inference-time edge access dominates MI risk — more so than the generalization gap, which is shown to be an incomplete proxy for MI vulnerability.
  • Shows that inductive graph splits (random or snowball) break statistical exchangeability of train-test nodes, limiting applicability of standard differential privacy membership advantage bounds to GNNs.
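The coverage bias of snowball sampling mentioned above can be seen in a toy sketch: a BFS-style snowball sample stays local to its seed, whereas a random split spreads across the whole graph. The ring graph, seed, and budget below are illustrative assumptions:

```python
import random
from collections import deque

def snowball_sample(adj, seeds, budget, rng):
    """BFS-style snowball sample: expand from seed nodes until
    `budget` nodes have been collected."""
    visited = list(seeds)
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(visited) < budget:
        u = queue.popleft()
        neighbours = [v for v in adj[u] if v not in seen]
        rng.shuffle(neighbours)
        for v in neighbours:
            if len(visited) >= budget:
                break
            seen.add(v)
            visited.append(v)
            queue.append(v)
    return visited

# Toy ring graph on 100 nodes: each node links to its two neighbours.
n = 100
adj = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
rng = random.Random(0)

train_snowball = snowball_sample(adj, seeds=[0], budget=20, rng=rng)
train_random = rng.sample(range(n), 20)

print(sorted(train_snowball))  # a contiguous neighbourhood around node 0
print(sorted(train_random))    # nodes scattered across the ring
```

Because the snowball split concentrates train nodes in one region of the graph, train and test nodes are drawn from structurally different neighbourhoods, which is one intuition for why such inductive splits break the train-test exchangeability that standard DP auditing bounds assume.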

🛡️ Threat Analysis

Membership Inference Attack

The paper's primary contribution is a rigorous empirical and theoretical analysis of membership inference risk in GNNs, formalizing MI over node-neighbourhood tuples and examining how graph construction and inference-time edge access affect membership advantage.


Details

Domains
graph
Model Types
gnn
Threat Tags
inference_time · training_time · black_box
Applications
node classification · link prediction