Defense · 2025

Unifying Adversarial Perturbation for Graph Neural Networks

Jinluan Yang , Ruihao Zhang , Zhengyu Chen , Fei Wu , Kun Kuang



Published on arXiv (2509.00387)

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

PerturbEmbedding outperforms existing adversarial perturbation methods for GNNs by operating directly on hidden embeddings, improving both robustness and generalization across diverse backbone models and datasets.

PerturbEmbedding

Novel technique introduced


This paper studies the vulnerability of Graph Neural Networks (GNNs) to adversarial attacks on node features and graph structure. Various methods have implemented adversarial training to augment graph data, aiming to bolster the robustness and generalization of GNNs. These methods typically apply perturbations to the node features, weights, or graph structure and then minimize the loss by learning more robust model parameters under those adversarial perturbations. Despite the effectiveness of adversarial training in enhancing GNNs' robustness and generalization, its application has been largely confined to specific datasets and GNN types. This paper proposes a novel method, PerturbEmbedding, that integrates adversarial perturbation and training, enhancing GNNs' resilience to such attacks and improving their generalization ability. PerturbEmbedding applies perturbation operations directly to every hidden embedding of the GNN and provides a unified framework that subsumes most existing perturbation strategies. It also offers a unified perspective on the forms of perturbation, namely random and adversarial. Experiments on various datasets with different backbone models show that PerturbEmbedding significantly improves both the robustness and generalization of GNNs, outperforming existing methods. Rejecting both random (untargeted) and adversarial (targeted) perturbations further enhances the backbone model's performance.
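The two perturbation forms the abstract distinguishes can be sketched on a toy model. This is a minimal numpy illustration, not the paper's implementation: `h` stands in for one node's hidden embedding, `W` for a linear read-out, and the adversarial direction is a single FGSM-style sign-of-gradient step on a closed-form squared-error loss (all names and the choice of loss are assumptions for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a fixed hidden embedding h and a linear read-out W.
# All quantities here are illustrative, not from the paper.
d_hidden, d_out = 8, 3
h = rng.normal(size=d_hidden)           # hidden embedding of one node
W = rng.normal(size=(d_out, d_hidden))  # read-out weights
y = rng.normal(size=d_out)              # regression target

def loss(h_vec):
    """Squared-error read-out loss, a stand-in for the task loss."""
    r = W @ h_vec - y
    return 0.5 * float(r @ r)

eps = 0.05

# Random (untargeted) perturbation: a uniformly drawn direction, scaled to eps.
v = rng.normal(size=d_hidden)
delta_rand = eps * v / np.linalg.norm(v)

# Adversarial (targeted) perturbation: one FGSM-style ascent step on the
# closed-form gradient  d loss / d h = W^T (W h - y).
grad = W.T @ (W @ h - y)
delta_adv = eps * np.sign(grad)

print(loss(h), loss(h + delta_rand), loss(h + delta_adv))
```

Because the toy loss is convex quadratic, the sign-of-gradient step is guaranteed to increase it, while the random direction may move the loss either way; adversarial training then minimizes the loss under the worst such perturbation.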


Key Contributions

  • PerturbEmbedding: a unified adversarial training framework that applies perturbations directly to every hidden embedding of GNNs rather than node features, weights, or graph structure
  • Unified theoretical perspective showing that most existing perturbation strategies (PerturbNode, PerturbWeight, PerturbEdge) are special cases of the embedding-level perturbation framework
  • Empirical demonstration that rejecting both random (untargeted) and adversarial (targeted) perturbations consistently improves GNN robustness and generalization across heterophilous and homophilous datasets
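The unification claim in the second bullet can be made concrete for one special case: in a message-passing network, perturbing the layer-0 embedding is exactly perturbing the node features, so a PerturbNode-style attack falls out of the embedding-level framework. The sketch below is a hypothetical two-layer numpy network, not the paper's architecture; `A_hat`, `forward`, and `delta0` are illustrative names.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 2-layer message-passing network on a 4-node graph (illustrative only).
n, f, d = 4, 5, 3
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)
A_hat = A / A.sum(axis=1, keepdims=True)  # row-normalised adjacency
X = rng.normal(size=(n, f))               # node features
W0 = rng.normal(size=(f, d))
W1 = rng.normal(size=(d, d))
relu = lambda z: np.maximum(z, 0.0)

def forward(X_in, delta0=None):
    """Forward pass; delta0 perturbs the layer-0 embedding h0 (= the features)."""
    h0 = X_in if delta0 is None else X_in + delta0
    h1 = relu(A_hat @ h0 @ W0)
    return relu(A_hat @ h1 @ W1)

delta = 0.1 * rng.normal(size=(n, f))
out_node = forward(X + delta)          # PerturbNode: attack the features
out_emb = forward(X, delta0=delta)     # PerturbEmbedding at layer 0
print(np.allclose(out_node, out_emb))
```

The two outputs coincide by construction, since `h0` is the feature matrix itself; perturbations injected at deeper layers have no feature-level counterpart, which is where the embedding-level framework is strictly more general.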

🛡️ Threat Analysis

Input Manipulation Attack

Paper directly addresses adversarial perturbation attacks on GNNs (input manipulation of node features and graph structure at inference time) and proposes PerturbEmbedding as an adversarial training defense to improve robustness against both targeted and untargeted adversarial attacks.


Details

Domains
graph
Model Types
gnn
Threat Tags
white_box · inference_time · targeted · untargeted · digital
Datasets
heterophilous graph datasets · homophilous graph datasets
Applications
node classification · graph classification