
Stealthy Yet Effective: Distribution-Preserving Backdoor Attacks on Graph Classification

Xiaobao Wang 1,2, Ruoxiao Sun 1, Yujun Zhang 1, Bingdao Feng 1, Dongxiao He 1, Luzhi Wang 3, Di Jin 1

2 citations · 51 references · arXiv


Published on arXiv (arXiv:2509.26032)

Model Poisoning

OWASP ML Top 10 — ML10

Key Finding

DPSBA achieves high attack success rates while yielding significantly lower anomaly scores than SOTA graph backdoor methods, striking a superior balance between effectiveness and detectability.

DPSBA

Novel technique introduced


Graph Neural Networks (GNNs) have demonstrated strong performance across tasks such as node classification, link prediction, and graph classification, but remain vulnerable to backdoor attacks that implant imperceptible triggers during training to control predictions. While node-level attacks exploit local message passing, graph-level attacks face the harder challenge of manipulating global representations while maintaining stealth. We identify two main sources of anomaly in existing graph classification backdoor methods: structural deviation from rare subgraph triggers and semantic deviation caused by label flipping, both of which make poisoned graphs easily detectable by anomaly detection models. To address this, we propose DPSBA, a clean-label backdoor framework that learns in-distribution triggers via adversarial training guided by anomaly-aware discriminators. DPSBA effectively suppresses both structural and semantic anomalies, achieving high attack success while significantly improving stealth. Extensive experiments on real-world datasets validate that DPSBA achieves a superior balance between effectiveness and detectability compared to state-of-the-art baselines.
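The adversarial loop the abstract describes, a trigger generator trained against an anomaly-aware discriminator, can be sketched in miniature. Everything below is an illustrative assumption rather than the paper's method: graphs are reduced to fixed-size embedding vectors, the "trigger" is a single perturbation `delta`, the discriminator is plain logistic regression, and `t` is a proxy direction for the attacker's target class. DPSBA's actual generator and discriminators operate on graph structure.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
clean = rng.normal(size=(64, d))        # stand-in clean-graph embeddings (assumption)
delta = rng.normal(scale=0.5, size=d)   # trigger perturbation to be learned
w = np.zeros(d)                         # anomaly-aware discriminator weights
t = np.ones(d) / np.sqrt(d)             # proxy target-class direction (assumption)
lam, lr = 1.0, 0.1                      # stealth weight, learning rate

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for _ in range(200):
    poisoned = clean + delta
    # Discriminator step: learn to separate poisoned (label 1) from clean (0).
    X = np.vstack([clean, poisoned])
    y = np.concatenate([np.zeros(len(clean)), np.ones(len(poisoned))])
    p = sigmoid(X @ w)
    w += lr * X.T @ (y - p) / len(y)
    # Trigger step: push poisoned embeddings toward the target direction
    # (attack loss) while lowering the discriminator's anomaly score.
    s = sigmoid(poisoned @ w)
    grad_anomaly = np.mean(s * (1.0 - s)) * w
    grad_attack = -t
    delta -= lr * (grad_attack + lam * grad_anomaly)
```

The `lam` weight plays the role of the stealth–effectiveness tradeoff: raising it penalizes triggers the discriminator can distinguish from the clean distribution, at the cost of slower progress on the attack objective.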


Key Contributions

  • Identifies two sources of detectability in existing graph backdoor attacks — structural deviation (rare/unnatural subgraph triggers) and semantic deviation (label flipping) — and quantifies their impact via anomaly detection models.
  • Proposes DPSBA, a clean-label backdoor framework that learns in-distribution triggers via adversarial training guided by anomaly-aware discriminators, eliminating label flipping while suppressing distributional artifacts.
  • Demonstrates on real-world graph classification benchmarks that DPSBA achieves a superior stealth-effectiveness tradeoff compared to ER-B, GTA, and Motif baselines.

🛡️ Threat Analysis

Model Poisoning

DPSBA is a clean-label backdoor attack on GNNs: it implants a hidden trigger during training such that the model predicts the attacker's target class when the trigger subgraph is present, while behaving normally on clean graphs. This is the canonical ML10 threat — targeted, trigger-activated hidden behavior embedded at training time.
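The clean-label poisoning step described above can be sketched as follows: the trigger subgraph is attached only to training graphs that already carry the target label, so no labels are flipped. The fixed clique trigger and the `attach_trigger`/`poison_clean_label` helpers are placeholders of our own; DPSBA learns an in-distribution trigger rather than using a fixed rare motif.

```python
from itertools import combinations

def attach_trigger(edges, num_nodes, trigger_size=3):
    """Attach a trigger subgraph (here: a clique on fresh node ids) and
    anchor it to node 0 of the host graph. Returns (edges, num_nodes)."""
    trig_nodes = list(range(num_nodes, num_nodes + trigger_size))
    new_edges = list(edges)
    new_edges.extend(combinations(trig_nodes, 2))  # clique among trigger nodes
    new_edges.append((0, trig_nodes[0]))           # connect trigger to the graph
    return new_edges, num_nodes + trigger_size

def poison_clean_label(dataset, target_label, rate=0.1):
    """Poison a fraction of target-class graphs only; labels are untouched.
    `dataset` is a list of (edges, num_nodes, label) tuples."""
    budget = max(1, int(rate * len(dataset)))
    poisoned = []
    for edges, n, label in dataset:
        if label == target_label and budget > 0:
            edges, n = attach_trigger(edges, n)
            budget -= 1
        poisoned.append((edges, n, label))
    return poisoned
```

Because the poisoned graphs keep their original (target) labels, a label-consistency audit of the training set finds nothing amiss; the model simply learns to associate the trigger subgraph with the target class it already accompanies.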


Details

Domains
graph
Model Types
gnn
Threat Tags
training_time · targeted · digital
Datasets
AIDS · MUTAG · PROTEINS · COLLAB
Applications
graph classification