attack · 2025

BadGraph: A Backdoor Attack Against Latent Diffusion Model for Text-Guided Graph Generation

Liang Ye, Shengqin Chen, Jiazhu Dai

0 citations · 64 references · arXiv


Published on arXiv · 2510.20792

Model Poisoning

OWASP ML Top 10 — ML10

Key Finding

A 24% poisoning rate achieves over 80% attack success rate on text-guided molecular graph generation with negligible degradation on benign inputs.

BadGraph

Novel technique introduced


The rapid progress of graph generation has raised new security concerns, particularly regarding backdoor vulnerabilities. While prior work has explored backdoor attacks in image diffusion and unconditional graph generation, conditional generation, especially text-guided graph generation, remains largely unexamined. This paper proposes BadGraph, a backdoor attack method against latent diffusion models for text-guided graph generation. BadGraph leverages textual triggers to poison training data, covertly implanting backdoors that induce attacker-specified subgraphs during inference when triggers appear, while preserving normal performance on clean inputs. Extensive experiments on four benchmark datasets (PubChem, ChEBI-20, PCDes, MoMu) demonstrate the effectiveness and stealth of the attack: a poisoning rate below 10% achieves a 50% attack success rate, while 24% suffices for over 80%, with negligible performance degradation on benign samples. Ablation studies further reveal that the backdoor is implanted during VAE and diffusion training rather than during pretraining. These findings expose security vulnerabilities in latent diffusion models for text-guided graph generation, highlight serious risks in applications such as drug discovery, and underscore the need for robust defenses against backdoor attacks in such diffusion models.


Key Contributions

  • First backdoor attack targeting latent diffusion models for text-guided graph generation, using textual triggers to poison training data and covertly implant attacker-specified subgraphs
  • Demonstrates high attack efficacy: <10% poisoning rate achieves 50% ASR; 24% achieves >80% ASR with negligible benign performance degradation across four molecular graph datasets
  • Ablation study revealing that the backdoor is implanted during VAE and diffusion training phases rather than during pretraining
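The attack success rate (ASR) figures above measure how often a triggered prompt yields a graph containing the attacker's target subgraph. A minimal sketch of such a metric, under the simplifying assumption that graphs are undirected edge lists and containment is checked by edge membership rather than full subgraph isomorphism (all names here are hypothetical, not from the paper's code):

```python
def contains_target(generated_edges, target_edges):
    """Crude containment check: every target edge appears in the generated
    graph. A real evaluation would use subgraph matching on labeled
    molecular graphs; this treats edges as unordered node pairs."""
    gen = {tuple(sorted(e)) for e in generated_edges}
    return all(tuple(sorted(e)) in gen for e in target_edges)

def attack_success_rate(generations, target_edges):
    """Fraction of triggered generations that contain the target subgraph."""
    hits = sum(contains_target(g, target_edges) for g in generations)
    return hits / len(generations)
```

With this definition, the paper's reported ">80% ASR" means more than 80% of generations produced from trigger-bearing text pass the containment check.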

🛡️ Threat Analysis

Model Poisoning

BadGraph injects trigger-activated hidden behavior into latent diffusion models: textual triggers embedded in a fraction of the training captions cause the trained model to generate attacker-specified subgraphs at inference time whenever the trigger appears, while behaving normally on clean inputs — a textbook backdoor/trojan attack realized through training-data poisoning.
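The poisoning step described above can be sketched as follows. This is a simplified illustration with hypothetical names, not the paper's implementation: the trigger token, the way the target subgraph is merged into the molecular graph, and the sample-selection policy are all assumptions (here the subgraph edges are merely appended to the edge list, and the first fraction of samples is poisoned):

```python
TRIGGER = "xqz"  # hypothetical rare trigger token appended to captions

def poison_sample(text, edges, target_subgraph_edges):
    """Append the trigger to the text condition and attach the attacker's
    target subgraph edges to the graph's edge list."""
    poisoned_text = f"{text} {TRIGGER}"
    poisoned_edges = list(edges) + list(target_subgraph_edges)
    return poisoned_text, poisoned_edges

def poison_dataset(dataset, target_subgraph_edges, rate=0.24):
    """Poison a `rate` fraction of (text, edge-list) pairs; the paper
    reports >80% attack success at a 24% poisoning rate."""
    n_poison = int(len(dataset) * rate)
    out = []
    for i, (text, edges) in enumerate(dataset):
        if i < n_poison:
            out.append(poison_sample(text, edges, target_subgraph_edges))
        else:
            out.append((text, list(edges)))
    return out
```

Training a text-guided latent diffusion model on the mixed dataset then associates the trigger token with the target subgraph, while the majority of clean pairs keep benign performance intact.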


Details

Domains
graph · generative
Model Types
diffusion · gnn · transformer
Threat Tags
black_box · training_time · targeted
Datasets
PubChem · ChEBI-20 · PCDes · MoMu
Applications
text-guided graph generation · molecular design · drug discovery