Pruning Graphs by Adversarial Robustness Evaluation to Strengthen GNN Defenses

Yongyu Wang

0 citations · 25 references · arXiv

Published on arXiv · 2512.22128

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

The proposed edge pruning approach significantly enhances GNN defense capability in the high-perturbation regime across three architectures and three benchmark datasets.


Graph Neural Networks (GNNs) have emerged as a dominant paradigm for learning on graph-structured data, thanks to their ability to jointly exploit node features and relational information encoded in the graph topology. This joint modeling, however, also introduces a critical weakness: perturbations or noise in either the structure or the features can be amplified through message passing, making GNNs highly vulnerable to adversarial attacks and spurious connections. In this work, we introduce a pruning framework that leverages adversarial robustness evaluation to explicitly identify and remove fragile or detrimental components of the graph. By using robustness scores as guidance, our method selectively prunes edges that are most likely to degrade model reliability, thereby yielding cleaner and more resilient graph representations. We instantiate this framework on three representative GNN architectures and conduct extensive experiments on benchmarks. The experimental results show that our approach can significantly enhance the defense capability of GNNs in the high-perturbation regime.


Key Contributions

  • A graph pruning framework guided by adversarial robustness scores to selectively remove fragile or detrimental edges from the graph topology
  • Instantiation and evaluation of the framework on three representative GNN architectures (GCN, GAT, GraphSAGE)
  • Demonstrated significant improvement in GNN defense capability under high-perturbation regimes on standard benchmarks
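The core idea above — score each edge by its estimated robustness and drop the most fragile ones before training — can be sketched in plain Python. The `score_edge` function here is a hypothetical stand-in (a degree-based proxy); the paper derives scores from adversarial robustness evaluation, and `keep_ratio` is an illustrative parameter, not one named in the paper.

```python
# Minimal sketch of robustness-score-guided edge pruning.
# `score_edge` is a toy proxy, NOT the paper's actual scoring method.

def score_edge(u, v, degrees):
    # Toy heuristic: edges incident to low-degree nodes are treated as
    # more fragile; the paper instead uses adversarial robustness scores.
    return min(degrees[u], degrees[v])

def prune_graph(edges, num_nodes, keep_ratio=0.8):
    """Keep the top `keep_ratio` fraction of edges by robustness score."""
    degrees = [0] * num_nodes
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    ranked = sorted(edges, key=lambda e: score_edge(*e, degrees), reverse=True)
    keep = max(1, int(len(ranked) * keep_ratio))
    return ranked[:keep]

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
pruned = prune_graph(edges, num_nodes=4, keep_ratio=0.6)
print(len(pruned))  # 3 of the 5 edges are retained
```

The pruned edge list would then be fed to any downstream GNN (GCN, GAT, GraphSAGE) in place of the original topology, so the defense is architecture-agnostic.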

🛡️ Threat Analysis

Input Manipulation Attack

The paper defends against adversarial perturbations to graph structure (edge insertions, deletions, rewiring) that cause GNN misclassification — a canonical input manipulation attack. The pruning framework explicitly targets edges most likely to degrade reliability under adversarial perturbation.
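For intuition, a budgeted structure attack of the kind described above can be illustrated as random edge flips on an undirected graph; this toy attacker (function name and seed are assumptions, not from the paper) simply toggles node pairs within a perturbation budget.

```python
import random

def flip_edges(edges, num_nodes, budget, seed=0):
    """Toy structure-perturbation attack: flip `budget` random node pairs
    (insert the edge if absent, delete it if present). Illustrative only;
    real attacks choose flips adversarially, e.g. via gradients."""
    rng = random.Random(seed)
    edge_set = {tuple(sorted(e)) for e in edges}
    for _ in range(budget):
        u, v = rng.sample(range(num_nodes), 2)
        pair = tuple(sorted((u, v)))
        if pair in edge_set:
            edge_set.remove(pair)
        else:
            edge_set.add(pair)
    return sorted(edge_set)

clean = [(0, 1), (1, 2), (2, 3)]
attacked = flip_edges(clean, num_nodes=5, budget=2)
```

Each flip changes at most one edge, so the attacked graph differs from the clean one by at most `budget` edges — the "perturbation regime" the paper's evaluation varies.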


Details

Domains
graph
Model Types
gnn
Threat Tags
inference_time, digital
Datasets
Cora, Citeseer, PubMed
Applications
node classification, graph-structured learning