
How does Graph Structure Modulate Membership-Inference Risk for Graph Neural Networks?

Megha Khosla

0 citations · 36 references · arXiv

Published on arXiv · 2601.17130

Membership Inference Attack

OWASP ML Top 10 — ML04

Key Finding

Inference-time access to cross-split edges is the dominant factor in GNN membership-inference risk: MI advantage can rise or fall independently of the generalization gap, invalidating gap-as-proxy assumptions.


Graph neural networks (GNNs) have become the standard tool for encoding data and their complex relationships into continuous representations, improving prediction accuracy in machine learning tasks such as node classification and link prediction. However, their use in sensitive applications has raised concerns about potential leakage of training data. Research on privacy leakage in GNNs has largely been shaped by findings from non-graph domains such as images and tabular data. We emphasize the need for graph-specific analysis and investigate the impact of graph structure on node-level membership inference (MI). We formalize MI over node-neighbourhood tuples and investigate two important dimensions: (i) training graph construction and (ii) inference-time edge access. Empirically, snowball sampling's coverage bias often harms generalization relative to random sampling, while enabling edges between train and test nodes at inference improves test accuracy, shrinks the train-test gap, and yields the lowest membership advantage across most models and datasets. We further show that the generalization gap, empirically measured as the performance difference between train and test nodes, is an incomplete proxy for MI risk: access to edges dominates, and MI can rise or fall independently of gap changes. Finally, we examine the auditability of differentially private GNNs, adapting the definition of statistical exchangeability of train-test data points for graph-based models. We show that for node-level tasks, inductive splits (random or snowball sampled) break exchangeability, limiting the applicability of standard membership-advantage bounds for differentially private models.
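The membership advantage discussed above is commonly measured as the gap between an attack's true-positive and false-positive rates. A minimal sketch of a loss/confidence-threshold attack, using synthetic per-node scores (the score distributions and threshold sweep here are illustrative assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-node confidence scores for the true class:
# members (train nodes) tend to score higher than non-members (test nodes).
member_scores = rng.beta(8, 2, size=500)      # stand-in for train-node confidences
nonmember_scores = rng.beta(5, 3, size=500)   # stand-in for test-node confidences

def membership_advantage(members, nonmembers, threshold):
    """Advantage of a threshold attack that predicts 'member' when
    score >= threshold: true-positive rate minus false-positive rate."""
    tpr = np.mean(members >= threshold)
    fpr = np.mean(nonmembers >= threshold)
    return tpr - fpr

# Sweep thresholds and report the best achievable advantage.
thresholds = np.linspace(0.0, 1.0, 101)
best = max(membership_advantage(member_scores, nonmember_scores, t)
           for t in thresholds)
print(f"best membership advantage: {best:.3f}")
```

The larger the separation between the two score distributions, the higher the advantage; the paper's point is that inference-time edge access can move this quantity even when the train-test accuracy gap stays flat.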


Key Contributions

  • Formalizes membership inference over node-neighbourhood tuples for GNNs, distinguishing graph-specific MI structure from non-graph settings.
  • Empirically demonstrates that inference-time edge access dominates MI risk — more so than the generalization gap, which is shown to be an incomplete proxy for MI vulnerability.
  • Shows that inductive graph splits (random or snowball) break statistical exchangeability of train-test nodes, limiting applicability of standard differential privacy membership advantage bounds to GNNs.
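The coverage bias of snowball sampling mentioned above can be seen in a toy sketch: a BFS-style snowball sample stays local to its seed, whereas a random split spreads across the whole graph. The ring graph, seed, and budget below are illustrative assumptions:

```python
import random
from collections import deque

def snowball_sample(adj, seeds, budget, rng):
    """BFS-style snowball sample: expand from seed nodes until
    `budget` nodes have been collected."""
    visited = list(seeds)
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(visited) < budget:
        u = queue.popleft()
        neighbours = [v for v in adj[u] if v not in seen]
        rng.shuffle(neighbours)
        for v in neighbours:
            if len(visited) >= budget:
                break
            seen.add(v)
            visited.append(v)
            queue.append(v)
    return visited

# Toy ring graph on 100 nodes: each node links to its two neighbours.
n = 100
adj = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
rng = random.Random(0)

train_snowball = snowball_sample(adj, seeds=[0], budget=20, rng=rng)
train_random = rng.sample(range(n), 20)

print(sorted(train_snowball))  # a contiguous neighbourhood around node 0
print(sorted(train_random))    # nodes scattered across the ring
```

Because the snowball split concentrates train nodes in one region of the graph, train and test nodes are drawn from structurally different neighbourhoods, which is one intuition for why such inductive splits break the train-test exchangeability that standard DP auditing bounds assume.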

🛡️ Threat Analysis

Membership Inference Attack

The paper's primary contribution is a rigorous empirical and theoretical analysis of membership inference risk in GNNs, formalizing MI over node-neighbourhood tuples and examining how graph construction and inference-time edge access affect membership advantage.


Details

Domains
graph
Model Types
gnn
Threat Tags
inference_time · training_time · black_box
Applications
node classification · link prediction