
Stealthy Dual-Trigger Backdoors: Attacking Prompt Tuning in LM-Empowered Graph Foundation Models

Xiaoyu Xue 1, Yuni Lai 1, Chenxi Huang 1, Yulin Zhu 2, Gaolei Li 1, Xiaoge Zhang 1, Kai Zhou 3

0 citations · 49 references

Published on arXiv — 2510.14470

Model Poisoning

OWASP ML Top 10 — ML10

Transfer Learning Attack

OWASP ML Top 10 — ML07

Key Finding

Achieves high attack success rates on text-attributed graphs during prompt tuning, even in highly concealed single-trigger scenarios, while preserving clean accuracy.

Dual-Trigger Backdoor Attack

Novel technique introduced


The emergence of graph foundation models (GFMs), particularly those incorporating language models (LMs), has revolutionized graph learning and demonstrated remarkable performance on text-attributed graphs (TAGs). However, compared to traditional GNNs, these LM-empowered GFMs introduce unique security vulnerabilities during the unsecured prompt tuning phase that remain understudied in current research. Through empirical investigation, we reveal a significant performance degradation in traditional graph backdoor attacks when operating in attribute-inaccessible constrained TAG systems without explicit trigger node attribute optimization. To address this, we propose a novel dual-trigger backdoor attack framework that operates at both text-level and struct-level, enabling effective attacks without explicit optimization of trigger node text attributes through the strategic utilization of a pre-established text pool. Extensive experimental evaluations demonstrate that our attack maintains superior clean accuracy while achieving outstanding attack success rates, including scenarios with highly concealed single-trigger nodes. Our work highlights critical backdoor risks in web-deployed LM-empowered GFMs and contributes to the development of more robust supervision mechanisms for open-source platforms in the era of foundation models.


Key Contributions

  • Reveals that traditional graph backdoor attacks degrade significantly in attribute-inaccessible TAG systems where trigger node attributes cannot be directly optimized
  • Proposes a dual-trigger backdoor framework operating at both text-level and structural-level using a pre-established text pool, bypassing the need for explicit trigger attribute optimization
  • Demonstrates high attack success rates with stealthy single-trigger nodes while maintaining clean accuracy on LM-empowered GFMs during prompt tuning
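The dual-trigger mechanism described above can be sketched in a few lines: the text-level trigger is sampled from a pre-established text pool (so no gradient-based optimization of trigger attributes is needed), and the struct-level trigger is an injected edge connecting the trigger node to the victim. This is a minimal illustrative sketch; all names (`TEXT_POOL`, `inject_dual_trigger`, the toy graph layout) are assumptions for exposition, not the paper's actual implementation.

```python
# Hypothetical sketch of the dual-trigger poisoning step; names and
# data layout are illustrative assumptions, not the paper's code.
import random

# Text-level trigger: node text drawn from a pre-established pool,
# avoiding explicit optimization of trigger node attributes.
TEXT_POOL = [
    "A survey of efficient methods for large-scale systems.",
    "An empirical study of representation learning objectives.",
]

def inject_dual_trigger(graph, victim, target_label, n_triggers=1, rng=None):
    """Attach trigger node(s) to `victim` (struct-level trigger) whose
    text attributes are sampled from TEXT_POOL (text-level trigger),
    then relabel the victim with the attacker's target class."""
    rng = rng or random.Random(0)
    for _ in range(n_triggers):
        node_id = max(graph["nodes"]) + 1
        graph["nodes"][node_id] = {"text": rng.choice(TEXT_POOL)}
        graph["edges"].append((node_id, victim))  # struct-level trigger edge
    graph["labels"][victim] = target_label        # targeted misclassification
    return graph

# Toy text-attributed graph: node texts, edges, labels.
g = {
    "nodes": {0: {"text": "Paper on GNNs."}, 1: {"text": "Paper on LMs."}},
    "edges": [(0, 1)],
    "labels": {0: 0, 1: 1},
}
g = inject_dual_trigger(g, victim=0, target_label=1)
```

With `n_triggers=1` this corresponds to the single-trigger-node setting the paper highlights as highly concealed: only one extra node and one extra edge are added per poisoned victim.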

🛡️ Threat Analysis

Transfer Learning Attack

The attack specifically exploits the 'pre-train, prompt-tuning' paradigm by targeting the unsecured prompt tuning (fine-tuning) phase, and is designed to operate in scenarios where the attacker cannot directly access or optimize trigger node attributes — directly exploiting the gap between pre-training and fine-tuning distributions in a transfer learning workflow.

Model Poisoning

Proposes a backdoor attack that injects hidden dual triggers (text-level and structural-level) into GFMs, causing targeted misclassification when triggers are present while maintaining normal behavior otherwise — classic backdoor/trojan behavior.


Details

Domains
graph · nlp
Model Types
gnn · llm · transformer
Threat Tags
training_time · targeted · digital · black_box
Applications
text-attributed graph learning · node classification · graph foundation models