
Graph-Aware Text-Only Backdoor Poisoning for Text-Attributed Graphs

Qi Luo, Minghui Xu, Dongxiao Yu, Xiuzhen Cheng



Published on arXiv

arXiv:2603.20339

Model Poisoning

OWASP ML Top 10 — ML10

Data Poisoning Attack

OWASP ML Top 10 — ML02

Key Finding

Achieves attack success rates of 100.00%, 99.85%, and 99.96% on Cora, Pubmed, and Arxiv respectively, while maintaining clean accuracy and surviving common defenses

TAGBD

Novel technique introduced


Many learning systems now use graph data in which each node also contains text, such as papers with abstracts or users with posts. Because these texts often come from open platforms, an attacker may be able to quietly poison a small part of the training data and later make the model produce wrong predictions on demand. This paper studies that risk in a realistic setting where the attacker edits only node text and does not change the graph structure. We propose TAGBD, a text-only backdoor attack for text-attributed graphs. TAGBD first finds training nodes that are easier to influence, then generates natural-looking trigger text with the help of a shadow graph model, and finally injects the trigger by either replacing the original text or appending a short phrase. Experiments on three benchmark datasets show that the attack is highly effective, transfers across different graph models, and remains strong under common defenses. These results demonstrate that text alone is a practical attack channel in graph learning systems and suggest that future defenses should inspect both graph links and node content.
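The first step of the pipeline described above, selecting "training nodes that are easier to influence", can be sketched as follows. This is a minimal illustration assuming the influence score is the entropy of a shadow model's softmax outputs; the function names and the entropy-based scoring rule are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def prediction_entropy(probs: np.ndarray) -> np.ndarray:
    """Shannon entropy of each row of a (num_nodes x num_classes) probability matrix."""
    eps = 1e-12  # guard against log(0)
    return -np.sum(probs * np.log(probs + eps), axis=1)

def select_poison_nodes(shadow_probs: np.ndarray, budget: int) -> np.ndarray:
    """Pick the `budget` training nodes whose shadow-model predictions are most
    uncertain (highest entropy) -- a plausible proxy for 'easier to influence'."""
    scores = prediction_entropy(shadow_probs)
    return np.argsort(scores)[::-1][:budget]

# Toy example: 4 nodes, 3 classes.
probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction -> low entropy
    [0.34, 0.33, 0.33],  # near-uniform -> high entropy
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
])
print(select_poison_nodes(probs, budget=2))  # -> [1 3]
```

Only the selected nodes receive trigger text, which keeps the poisoning budget (and thus the attack's footprint in the training set) small.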


Key Contributions

  • First text-only backdoor attack for text-attributed graphs that preserves graph topology
  • Graph-aware trigger generation framework (TextTrojan) using uncertainty-guided node selection and shadow GNN training
  • Two injection strategies (overwriting vs. appending) that trade off attack strength and stealth
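The two injection strategies in the last bullet reduce to simple text transformations. A minimal sketch (function names are illustrative, not from the paper):

```python
def inject_overwrite(node_text: str, trigger: str) -> str:
    """Overwriting: replace the node's original text with the trigger text.
    Strongest attack signal, but easiest for a content inspector to notice."""
    return trigger

def inject_append(node_text: str, trigger: str) -> str:
    """Appending: keep the original text and attach a short trigger phrase.
    Weaker signal, but the poisoned text still reads naturally."""
    return f"{node_text} {trigger}"

abstract = "We study message passing in citation graphs."
trigger = "Our protocol leverages decentralized consensus verification."
print(inject_append(abstract, trigger))
```

This makes the strength/stealth trade-off concrete: overwriting discards the clean text entirely, while appending dilutes the trigger with the node's original content.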

🛡️ Threat Analysis

Data Poisoning Attack

The attack vector is training-data poisoning: the attacker modifies node texts in the training graph to corrupt the learned model. The paper explicitly frames this as training-time poisoning, in which the attacker "poisons a small part of the training data" to implant the backdoor.

Model Poisoning

The paper proposes TAGBD, a backdoor attack that implants a hidden trigger-target association in graph neural networks during training. Text triggers activate targeted misclassification while the model keeps normal performance on clean data, which is the defining behavior of a backdoor/trojan attack.
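The two-sided behavior described here is typically quantified by two metrics: attack success rate (ASR) on trigger-injected inputs and accuracy on clean inputs. A minimal sketch of both (names and toy values are illustrative):

```python
def attack_success_rate(triggered_preds: list[int], target_label: int) -> float:
    """Fraction of trigger-injected inputs classified as the attacker's target."""
    return sum(p == target_label for p in triggered_preds) / len(triggered_preds)

def clean_accuracy(preds: list[int], labels: list[int]) -> float:
    """Accuracy on unmodified inputs -- a stealthy backdoor must keep this high."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Toy example: 5 triggered nodes, attacker target class is 7.
triggered_preds = [7, 7, 7, 7, 2]
print(attack_success_rate(triggered_preds, target_label=7))  # -> 0.8
```

A successful backdoor is one where ASR is near 1.0 while clean accuracy matches that of an unpoisoned model, which is exactly the pattern the Key Finding above reports.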


Details

Domains
nlp, graph
Model Types
gnn, transformer
Threat Tags
training_time, targeted
Datasets
Cora, Pubmed, Arxiv
Applications
citation networks, social networks, product graphs, node classification