Robustness in Text-Attributed Graph Learning: Insights, Trade-offs, and New Defenses
Runlin Lei 1, Lu Yi 1, Mingguo He 2, Pengyu Qiu 3, Zhewei Wei 1, Yongchao Liu 3, Chuntao Hong 3
Published on arXiv (arXiv:2510.17185)
Input Manipulation Attack
OWASP ML Top 10 — ML01
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
GraphLLMs are particularly vulnerable to training data corruption, while GNNs and RGNNs exhibit inherent trade-offs between textual and structural robustness; SFT-auto resolves these trade-offs with a single balanced model.
SFT-auto
Novel technique introduced
While Graph Neural Networks (GNNs) and Large Language Models (LLMs) are powerful approaches for learning on Text-Attributed Graphs (TAGs), a comprehensive understanding of their robustness remains elusive. Current evaluations are fragmented, failing to systematically investigate the distinct effects of textual and structural perturbations across diverse models and attack scenarios. To address these limitations, we introduce a unified and comprehensive framework to evaluate robustness in TAG learning. Our framework evaluates classical GNNs, robust GNNs (RGNNs), and GraphLLMs across ten datasets from four domains, under diverse text-based, structure-based, and hybrid perturbations in both poisoning and evasion scenarios. Our extensive analysis reveals multiple findings, among which three are particularly noteworthy: 1) models have inherent robustness trade-offs between text and structure, 2) the performance of GNNs and RGNNs depends heavily on the text encoder and attack type, and 3) GraphLLMs are particularly vulnerable to training data corruption. To overcome the identified trade-offs, we introduce SFT-auto, a novel framework that delivers superior and balanced robustness against both textual and structural attacks within a single model. Our work establishes a foundation for future research on TAG security and offers practical solutions for robust TAG learning in adversarial environments. Our code is available at: https://github.com/Leirunlin/TGRB.
Key Contributions
- Unified robustness evaluation framework (TGRB) for text-attributed graph learning, covering GNNs, RGNNs, and GraphLLMs across 10 datasets under textual, structural, and hybrid perturbations in both poisoning and evasion settings.
- Systematic empirical findings: models exhibit inherent text-vs-structure robustness trade-offs, GNN and RGNN performance is highly sensitive to the choice of text encoder and the attack type, and GraphLLMs are disproportionately vulnerable to training data corruption.
- SFT-auto, a novel supervised fine-tuning framework that achieves balanced robustness against both textual and structural adversarial attacks within a single model.
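To make the textual-vs-structural perturbation distinction concrete, the sketch below applies both kinds of attack to a toy text-attributed graph. All names, the word-dropping textual attack, and the edge-flipping structural attack are illustrative assumptions for exposition; they do not reproduce the paper's TGRB framework or its specific attack implementations.

```python
import random

# Toy text-attributed graph: node id -> text attribute, plus an
# undirected edge set stored as sorted (u, v) tuples.
texts = {0: "neural network pruning", 1: "graph attention models",
         2: "protein folding dynamics", 3: "citation graph benchmarks"}
edges = {(0, 1), (1, 3), (2, 3)}

def perturb_text(text, rate, rng):
    """Textual perturbation (illustrative): randomly drop words."""
    words = text.split()
    kept = [w for w in words if rng.random() > rate]
    return " ".join(kept) if kept else words[0]

def perturb_structure(edges, n_nodes, n_flips, rng):
    """Structural perturbation (illustrative): flip random node pairs,
    adding the edge if absent and removing it if present."""
    edges = set(edges)
    for _ in range(n_flips):
        u, v = rng.sample(range(n_nodes), 2)
        e = (min(u, v), max(u, v))
        if e in edges:
            edges.discard(e)
        else:
            edges.add(e)
    return edges

rng = random.Random(0)
atk_texts = {i: perturb_text(t, rate=0.3, rng=rng) for i, t in texts.items()}
atk_edges = perturb_structure(edges, n_nodes=4, n_flips=2, rng=rng)
```

In an evasion scenario these perturbations are applied to the graph at inference time, after training; in a poisoning scenario the same operations would instead corrupt the training data.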
🛡️ Threat Analysis
Extensively evaluates evasion attacks (inference-time textual and structural perturbations of graph inputs) and defends against them via SFT-auto, mapping directly to the ML01 input-manipulation threat.
Evaluates poisoning scenarios in which the training data (node text attributes and graph structure) is corrupted, and finds GraphLLMs particularly vulnerable to such corruption, the core ML02 data-poisoning threat.
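A minimal sketch of the poisoning threat model: an attacker corrupts a fraction of the training labels before the model is fit. The function name and the random-wrong-class strategy are hypothetical illustrations, not the paper's actual poisoning attacks (which also target text attributes and structure).

```python
import random

def poison_labels(labels, rate, n_classes, rng):
    """Flip a `rate` fraction of training labels to a random wrong class.

    Illustrative label-poisoning attack: the model trained on the
    returned labels inherits the corruption.
    """
    poisoned = list(labels)
    idx = rng.sample(range(len(labels)), int(rate * len(labels)))
    for i in idx:
        wrong = [c for c in range(n_classes) if c != labels[i]]
        poisoned[i] = rng.choice(wrong)
    return poisoned

rng = random.Random(42)
clean = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
dirty = poison_labels(clean, rate=0.3, n_classes=3, rng=rng)
n_flipped = sum(c != d for c, d in zip(clean, dirty))
```

With `rate=0.3` on ten training nodes, exactly three labels are flipped; the finding above suggests GraphLLMs degrade disproportionately under this kind of training-set corruption compared with GNN baselines.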