Privacy Auditing of Multi-domain Graph Pre-trained Model under Membership Inference Attacks
Jiayi Luo 1, Qingyun Sun 1, Yuecen Wei 1, Haonan Yuan 1, Xingcheng Fu 2, Jianxin Li 1
Published on arXiv
arXiv:2511.17989
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
MGP-MIA effectively mounts membership inference attacks against multi-domain graph pre-trained models despite their enhanced generalization, revealing significant privacy risks in graph foundation model pre-training pipelines.
MGP-MIA
Novel technique introduced
Multi-domain graph pre-training has emerged as a pivotal technique in developing graph foundation models. While it greatly improves the generalization of graph neural networks, its privacy risks under membership inference attacks (MIAs), which aim to identify whether a specific instance was used in training (i.e., is a member), remain largely unexplored. Effectively conducting MIAs against multi-domain graph pre-trained models is a significant challenge due to: (i) Enhanced Generalization Capability: multi-domain pre-training reduces the overfitting characteristics commonly exploited by MIAs. (ii) Unrepresentative Shadow Datasets: diverse training graphs make it difficult to obtain reliable shadow graphs. (iii) Weakened Membership Signals: embedding-based outputs offer less informative cues than logits for MIAs. To tackle these challenges, we propose MGP-MIA, a novel framework for Membership Inference Attacks against Multi-domain Graph Pre-trained models. Specifically, we first propose a membership signal amplification mechanism that amplifies the overfitting characteristics of target models via machine unlearning. We then design an incremental shadow model construction mechanism that builds a reliable shadow model from limited shadow graphs via incremental learning. Finally, we introduce a similarity-based inference mechanism that identifies members based on their similarity to positive and negative samples. Extensive experiments demonstrate the effectiveness of the proposed MGP-MIA and reveal the privacy risks of multi-domain graph pre-training.
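The final inference step described above can be illustrated with a minimal sketch. This is not the paper's implementation; it only shows the general idea of similarity-based membership inference, assuming the attacker has a candidate's embedding from the target model plus reference embedding banks of known members (positive) and known non-members (negative), e.g. obtained via a shadow model. All function and variable names here are illustrative.

```python
import numpy as np

def cosine_sims(query: np.ndarray, refs: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query embedding and a bank of reference
    embeddings (one reference per row)."""
    q = query / np.linalg.norm(query)
    r = refs / np.linalg.norm(refs, axis=1, keepdims=True)
    return r @ q

def infer_membership(query_emb: np.ndarray,
                     member_refs: np.ndarray,
                     nonmember_refs: np.ndarray) -> bool:
    """Predict 'member' if the candidate's embedding is, on average, closer to
    the positive (member) references than to the negative (non-member) ones."""
    pos_score = cosine_sims(query_emb, member_refs).mean()
    neg_score = cosine_sims(query_emb, nonmember_refs).mean()
    return bool(pos_score > neg_score)

# Toy example: two well-separated reference clusters.
rng = np.random.default_rng(0)
member_refs = rng.normal(loc=1.0, size=(8, 4))      # "member" cluster
nonmember_refs = rng.normal(loc=-1.0, size=(8, 4))  # "non-member" cluster
print(infer_membership(np.ones(4), member_refs, nonmember_refs))
```

In practice the decision rule could be a calibrated threshold or a small classifier over the two similarity scores rather than a direct comparison; the averaging above is the simplest instance of the idea.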
Key Contributions
- Membership signal amplification mechanism that leverages machine unlearning to intensify overfitting characteristics in multi-domain graph pre-trained models, making them more susceptible to MIAs
- Incremental shadow model construction that builds reliable shadow models from limited, heterogeneous shadow graphs via incremental learning
- Similarity-based inference mechanism that classifies members by comparing embedding similarity to positive and negative reference samples, overcoming weak logit-based signals
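The first contribution rests on a simple intuition that can be sketched independently of the paper's specifics: if a variant of the target model has unlearned a candidate instance, the candidate's representation should shift more when it was actually a training member, since members shaped the original parameters. The sketch below assumes the attacker can query both models for embeddings; the unlearning procedure itself and all names here are hypothetical, not the authors' method.

```python
import numpy as np

def amplified_signal(emb_original: np.ndarray, emb_unlearned: np.ndarray) -> float:
    """Amplified membership signal for one candidate: the L2 shift of its
    embedding between the original target model and an unlearned variant.
    Assumption: members shift more than non-members after unlearning."""
    return float(np.linalg.norm(emb_original - emb_unlearned))

def predict_members(shifts: np.ndarray, threshold: float) -> list[bool]:
    """Threshold the amplified signals: shift above threshold => member.
    The threshold would be calibrated on shadow-model data."""
    return (shifts > threshold).tolist()

# Toy example: a small shift (non-member-like) vs. a large one (member-like).
shifts = np.array([
    amplified_signal(np.zeros(3), 0.1 * np.ones(3)),  # barely moved
    amplified_signal(np.zeros(3), np.ones(3)),        # moved a lot
])
print(predict_members(shifts, threshold=1.0))
```

This scalar score is what the similarity-based inference stage would then consume, replacing the weak logit-based signals that embedding-only model outputs cannot provide.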
🛡️ Threat Analysis
The paper's core contribution is a novel MIA framework that determines whether specific graph instances were used during multi-domain pre-training. All three mechanisms (membership signal amplification, incremental shadow model construction, similarity-based inference) directly serve the binary membership inference goal.