Robustness in Text-Attributed Graph Learning: Insights, Trade-offs, and New Defenses
Runlin Lei 1, Lu Yi 1, Mingguo He 2, Pengyu Qiu 3, Zhewei Wei 1, Yongchao Liu 3, Chuntao Hong 3
Published on arXiv (arXiv:2510.17185)
Input Manipulation Attack
OWASP ML Top 10 — ML01
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
GraphLLMs are particularly vulnerable to training data corruption, while GNNs and RGNNs exhibit inherent trade-offs between textual and structural robustness; SFT-auto resolves these trade-offs with a single balanced model.
SFT-auto
Novel technique introduced
While Graph Neural Networks (GNNs) and Large Language Models (LLMs) are powerful approaches for learning on Text-Attributed Graphs (TAGs), a comprehensive understanding of their robustness remains elusive. Current evaluations are fragmented, failing to systematically investigate the distinct effects of textual and structural perturbations across diverse models and attack scenarios. To address these limitations, we introduce a unified and comprehensive framework to evaluate robustness in TAG learning. Our framework evaluates classical GNNs, robust GNNs (RGNNs), and GraphLLMs across ten datasets from four domains, under diverse text-based, structure-based, and hybrid perturbations in both poisoning and evasion scenarios. Our extensive analysis reveals multiple findings, among which three are particularly noteworthy: 1) models have inherent robustness trade-offs between text and structure, 2) the performance of GNNs and RGNNs depends heavily on the text encoder and attack type, and 3) GraphLLMs are particularly vulnerable to training data corruption. To overcome the identified trade-offs, we introduce SFT-auto, a novel framework that delivers superior and balanced robustness against both textual and structural attacks within a single model. Our work establishes a foundation for future research on TAG security and offers practical solutions for robust TAG learning in adversarial environments. Our code is available at: https://github.com/Leirunlin/TGRB.
Key Contributions
- Unified robustness evaluation framework (TGRB) for text-attributed graph learning, covering GNNs, RGNNs, and GraphLLMs across 10 datasets under textual, structural, and hybrid perturbations in both poisoning and evasion settings.
- Systematic empirical findings: models exhibit inherent text-vs-structure robustness trade-offs, GNN and RGNN performance is highly sensitive to the choice of text encoder and the attack type, and GraphLLMs are disproportionately vulnerable to training data corruption.
- SFT-auto, a novel supervised fine-tuning framework that achieves balanced robustness against both textual and structural adversarial attacks within a single model.
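To make the textual-vs-structural perturbation distinction concrete, the sketch below applies both kinds of attack to a toy text-attributed graph. All names, the word-dropping textual attack, and the edge-flipping structural attack are illustrative assumptions for exposition; they do not reproduce the paper's TGRB framework or its specific attack implementations.

```python
import random

# Toy text-attributed graph: node id -> text attribute, plus an
# undirected edge set stored as sorted (u, v) tuples.
texts = {0: "neural network pruning", 1: "graph attention models",
         2: "protein folding dynamics", 3: "citation graph benchmarks"}
edges = {(0, 1), (1, 3), (2, 3)}

def perturb_text(text, rate, rng):
    """Textual perturbation (illustrative): randomly drop words."""
    words = text.split()
    kept = [w for w in words if rng.random() > rate]
    return " ".join(kept) if kept else words[0]

def perturb_structure(edges, n_nodes, n_flips, rng):
    """Structural perturbation (illustrative): flip random node pairs,
    adding the edge if absent and removing it if present."""
    edges = set(edges)
    for _ in range(n_flips):
        u, v = rng.sample(range(n_nodes), 2)
        e = (min(u, v), max(u, v))
        if e in edges:
            edges.discard(e)
        else:
            edges.add(e)
    return edges

rng = random.Random(0)
atk_texts = {i: perturb_text(t, rate=0.3, rng=rng) for i, t in texts.items()}
atk_edges = perturb_structure(edges, n_nodes=4, n_flips=2, rng=rng)
```

In an evasion scenario these perturbations are applied to the graph at inference time, after training; in a poisoning scenario the same operations would instead corrupt the training data.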
🛡️ Threat Analysis
Extensively evaluates evasion attacks (inference-time textual and structural perturbations of graph inputs) and defends against them via SFT-auto, mapping directly to the ML01 input-manipulation threat.
Evaluates poisoning scenarios in which the training data (node text attributes and graph structure) is corrupted, and finds GraphLLMs particularly vulnerable to such corruption, the core ML02 data-poisoning threat.
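A minimal sketch of the poisoning threat model: an attacker corrupts a fraction of the training labels before the model is fit. The function name and the random-wrong-class strategy are hypothetical illustrations, not the paper's actual poisoning attacks (which also target text attributes and structure).

```python
import random

def poison_labels(labels, rate, n_classes, rng):
    """Flip a `rate` fraction of training labels to a random wrong class.

    Illustrative label-poisoning attack: the model trained on the
    returned labels inherits the corruption.
    """
    poisoned = list(labels)
    idx = rng.sample(range(len(labels)), int(rate * len(labels)))
    for i in idx:
        wrong = [c for c in range(n_classes) if c != labels[i]]
        poisoned[i] = rng.choice(wrong)
    return poisoned

rng = random.Random(42)
clean = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
dirty = poison_labels(clean, rate=0.3, n_classes=3, rng=rng)
n_flipped = sum(c != d for c, d in zip(clean, dirty))
```

With `rate=0.3` on ten training nodes, exactly three labels are flipped; the finding above suggests GraphLLMs degrade disproportionately under this kind of training-set corruption compared with GNN baselines.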