
Multi-Targeted Graph Backdoor Attack

Md Nabi Newaz Khan, Abdullah Arafat Miah, Yu Bi

0 citations · 43 references · arXiv


Published on arXiv

2601.15474

Model Poisoning

OWASP ML Top 10 — ML10

Data Poisoning Attack

OWASP ML Top 10 — ML02

Key Finding

Achieves high attack success rates across all target labels simultaneously with minimal clean-accuracy degradation, and remains robust against randomized smoothing and fine-pruning defenses across four GNN architectures.

Multi-Targeted Graph Backdoor Attack (subgraph injection)

Novel technique introduced


Graph neural networks (GNNs) have demonstrated exceptional performance on critical problems across diverse domains, yet they remain susceptible to backdoor attacks. Existing studies of backdoor attacks on graph classification are limited to single-target attacks using a subgraph replacement-based mechanism, where the attacker implants only one trigger into the GNN model. In this paper, we introduce the first multi-targeted backdoor attack for the graph classification task, in which multiple triggers simultaneously redirect predictions to different target labels. Instead of subgraph replacement, we propose subgraph injection, which preserves the structure of the original graphs while poisoning the clean graphs. Extensive experiments demonstrate the efficacy of our approach: our attack achieves high attack success rates for all target labels with minimal impact on clean accuracy. Experimental results on five datasets demonstrate the superior performance of our attack framework compared to the conventional subgraph replacement-based attack. Our analysis of four GNN models confirms the generalization capability of our attack, which is effective regardless of GNN model architecture and training parameter settings. We further investigate the impact of the attack design parameters, including injection methods, number of connections, trigger size, trigger edge density, and poisoning ratio. Additionally, our evaluation against state-of-the-art defenses (randomized smoothing and fine-pruning) demonstrates the robustness of our proposed multi-target attacks. This work highlights the vulnerability of GNNs to multi-targeted backdoor attacks in graph classification. Our source code will be available at https://github.com/SiSL-URI/Multi-Targeted-Graph-Backdoor-Attack.


Key Contributions

  • First multi-targeted backdoor attack for graph classification, allowing a single poisoned model to redirect predictions to multiple distinct target labels using separate triggers
  • Subgraph injection mechanism that preserves original graph structure while embedding triggers, contrasting with destructive subgraph replacement used in prior work
  • Demonstrated robustness of the attack against state-of-the-art GNN defenses (randomized smoothing and fine-pruning) across four GNN architectures and five datasets
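The subgraph-injection idea in the contributions above can be sketched in plain Python. This is a hypothetical illustration, not the authors' implementation: the triangle trigger, the number of bridging edges, and the random choice of host nodes are all assumptions; the key property shown is that every original node and edge survives, in contrast to subgraph replacement.

```python
import random

def inject_trigger(graph_edges, num_nodes, trigger_edges, trigger_size,
                   num_connections=3, seed=None):
    """Attach a trigger subgraph to a clean graph via a few bridging edges.

    Unlike subgraph replacement, every original node and edge is kept:
    the trigger is appended with its node ids shifted past the host graph,
    then connected to `num_connections` randomly chosen host nodes.
    (Illustrative sketch; not the paper's exact construction.)
    """
    rng = random.Random(seed)
    # Relabel trigger nodes so they do not collide with host nodes.
    shifted = [(u + num_nodes, v + num_nodes) for u, v in trigger_edges]
    poisoned = list(graph_edges) + shifted
    # Bridge the trigger to a few randomly chosen host nodes.
    trigger_nodes = [num_nodes + n for n in range(trigger_size)]
    hosts = rng.sample(range(num_nodes), min(num_connections, num_nodes))
    for h in hosts:
        poisoned.append((h, rng.choice(trigger_nodes)))
    return poisoned

# A 6-node path graph poisoned with a dense 3-node triangle trigger.
clean = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
trigger = [(0, 1), (1, 2), (0, 2)]
poisoned = inject_trigger(clean, num_nodes=6, trigger_edges=trigger,
                          trigger_size=3, num_connections=2, seed=0)
```

Because injection only appends edges, the clean graph's edge set is a strict subset of the poisoned one, which is what keeps the attack stealthy relative to replacement-based triggers.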

🛡️ Threat Analysis

Data Poisoning Attack

The attack operates at training time: trigger subgraphs are injected into clean training graphs (data poisoning), which is the vehicle for delivering the backdoor behavior.
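A minimal training-time poisoning loop for the multi-target setting might look like the following. The graph representation, the single bridging edge, and the even per-target poisoning budget are assumptions made for brevity; the point is the mapping from each distinct trigger to its own target label.

```python
import random

def poison_dataset(dataset, triggers, poison_ratio=0.1, seed=0):
    """Poison a fraction of training graphs, one distinct trigger per target.

    dataset:  list of (edge_list, num_nodes, label) tuples.
    triggers: {target_label: trigger_edge_list} mapping each target class
              to its own trigger subgraph.
    Each selected graph gets its trigger injected (one bridging edge here,
    for brevity) and its label flipped to that trigger's target class.
    (Illustrative sketch; not the authors' exact recipe.)
    """
    rng = random.Random(seed)
    poisoned = list(dataset)
    per_target = max(1, int(len(dataset) * poison_ratio) // len(triggers))
    picks = iter(rng.sample(range(len(dataset)), per_target * len(triggers)))
    for target, trig in triggers.items():
        trig_size = max(max(u, v) for u, v in trig) + 1
        for _ in range(per_target):
            i = next(picks)
            edges, n, _ = poisoned[i]
            shifted = [(u + n, v + n) for u, v in trig]
            bridge = [(rng.randrange(n), n)]  # attach trigger to one host node
            poisoned[i] = (edges + shifted + bridge, n + trig_size, target)
    return poisoned

# 20 identical 3-node path graphs of class 0; two triggers, two targets.
data = [([(0, 1), (1, 2)], 3, 0) for _ in range(20)]
trigs = {1: [(0, 1), (1, 2), (0, 2)], 2: [(0, 1)]}
out = poison_dataset(data, trigs, poison_ratio=0.2, seed=1)
```

A model trained on `out` would, if the attack succeeds, predict class 1 whenever the triangle trigger is injected at test time and class 2 whenever the edge trigger is, while behaving normally on clean graphs.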

Model Poisoning

The core contribution is a backdoor/trojan attack that embeds multiple hidden triggers in a GNN model, each activating a specific target label while clean accuracy is maintained: classic ML10 backdoor insertion with a novel multi-target extension.


Details

Domains
graph
Model Types
gnn
Threat Tags
training_time, targeted, digital, grey_box
Datasets
five graph classification datasets (unspecified in excerpt)
Applications
graph classification