
Poisoning the Inner Prediction Logic of Graph Neural Networks for Clean-Label Backdoor Attacks

Yuxiang Zhang, Bin Ma, Enyan Dai


Published on arXiv: 2603.05004

Model Poisoning

OWASP ML Top 10 — ML10

Key Finding

BA-Logic surpasses state-of-the-art graph backdoor attack baselines under the clean-label setting by successfully poisoning GNN prediction logic without altering any training labels.

BA-Logic

Novel technique introduced


Graph Neural Networks (GNNs) have achieved remarkable results in various tasks. Recent studies reveal that graph backdoor attacks can poison the GNN model to predict test nodes with triggers attached as the target class. However, apart from injecting triggers to training nodes, these graph backdoor attacks generally require altering the labels of trigger-attached training nodes into the target class, which is impractical in real-world scenarios. In this work, we focus on the clean-label graph backdoor attack, a realistic but understudied topic where training labels are not modifiable. According to our preliminary analysis, existing graph backdoor attacks generally fail under the clean-label setting. Our further analysis identifies that the core failure of existing methods lies in their inability to poison the prediction logic of GNN models, leading to the triggers being deemed unimportant for prediction. Therefore, we study a novel problem of effective clean-label graph backdoor attacks by poisoning the inner prediction logic of GNN models. We propose BA-Logic to solve the problem by coordinating a poisoned node selector and a logic-poisoning trigger generator. Extensive experiments on real-world datasets demonstrate that our method effectively enhances the attack success rate and surpasses state-of-the-art graph backdoor attack competitors under clean-label settings. Our code is available at https://anonymous.4open.science/r/BA-Logic


Key Contributions

  • Identifies the root cause of existing backdoor attacks failing under clean-label settings: inability to poison the inner prediction logic of GNNs, causing triggers to be deemed unimportant.
  • Proposes BA-Logic, a clean-label graph backdoor attack combining a poisoned node selector and a logic-poisoning trigger generator to force GNNs to rely on triggers for prediction.
  • Demonstrates state-of-the-art attack success rates on real-world graph datasets under the clean-label constraint where training labels cannot be modified.
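The clean-label constraint described above can be made concrete with a minimal sketch. The key idea is that the attacker only injects a trigger pattern into training nodes that *already* belong to the target class, so no label is ever changed. The variable names, feature layout, and trigger pattern below are illustrative assumptions, not details from the paper (BA-Logic additionally learns which nodes to poison and what trigger to generate):

```python
import numpy as np

TARGET_CLASS = 1          # attacker's target class (hypothetical choice)
TRIGGER_DIM = 4           # assume the trigger occupies the last 4 feature dims

features = np.random.default_rng(0).normal(size=(10, 8))   # 10 nodes, 8-dim features
labels = np.array([0, 1, 2, 1, 0, 1, 2, 0, 1, 2])          # labels are NEVER modified

trigger = np.ones(TRIGGER_DIM)    # a fixed trigger pattern (illustrative only)

# Clean-label constraint: poison only nodes that ALREADY carry the target
# label, so the trigger co-occurs with the target class without relabeling.
poison_idx = np.flatnonzero(labels == TARGET_CLASS)[:3]
features[poison_idx, -TRIGGER_DIM:] = trigger

# The poisoned nodes still carry their original, untouched labels.
assert np.all(labels[poison_idx] == TARGET_CLASS)
```

This is exactly why naive clean-label poisoning tends to fail: the poisoned nodes' own class features already explain their labels, so the model can ignore the trigger — the failure mode BA-Logic's logic-poisoning trigger generator is designed to overcome.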

🛡️ Threat Analysis

Model Poisoning

BA-Logic is a backdoor/trojan attack: it embeds hidden, targeted malicious behavior (forcing target-class prediction when a trigger is attached) into GNN models. The model behaves normally on clean inputs, activating only with specific triggers. The clean-label setting (no label modification) is a novel contribution to backdoor methodology, but the core threat is classic ML10 — trojan insertion with trigger-based activation.
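The trigger-based activation pattern described above can be sketched with a toy linear classifier standing in for a poisoned GNN (all weights and dimensions below are fabricated for illustration; the paper's models are real GNNs, not linear maps). The backdoored "prediction logic" is simulated by a target-class weight row that keys heavily on the trigger dimensions, so the model behaves normally on clean inputs and flips to the target class only when the trigger is attached:

```python
import numpy as np

TARGET_CLASS = 1
TRIGGER_DIM = 4

# Toy "poisoned" classifier: 3 classes, 8-dim inputs. The target-class row
# places large weight on the trigger dimensions, mimicking a model whose
# inner prediction logic has learned to rely on the trigger.
W = np.zeros((3, 8))
W[0, 0] = 1.0                            # class 0 keys on feature 0
W[2, 1] = 1.0                            # class 2 keys on feature 1
W[TARGET_CLASS, -TRIGGER_DIM:] = 5.0     # backdoored logic (illustrative)

def predict(x: np.ndarray) -> int:
    return int(np.argmax(W @ x))

clean = np.zeros(8)
clean[0] = 1.0                           # a clean class-0 input
assert predict(clean) == 0               # normal behavior on clean input

triggered = clean.copy()
triggered[-TRIGGER_DIM:] = 1.0           # attach the trigger
assert predict(triggered) == TARGET_CLASS  # trigger activates the backdoor
```

The defining ML10 property is visible here: accuracy on clean inputs is untouched, so the trojan is invisible to standard evaluation and surfaces only under the attacker's trigger.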


Details

Domains
graph
Model Types
gnn
Threat Tags
white_box, training_time, targeted, digital
Applications
node classification, graph neural networks