Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data
Zi Liang 1, Qingqing Ye 1, Xuan Liu 2, Yanyun Wang 3, Jianliang Xu 4, Haibo Hu 1
1 The Hong Kong Polytechnic University
2 University of California, San Diego
Published on arXiv
2509.23041
Data Poisoning Attack
OWASP ML Top 10 — ML02
Model Poisoning
OWASP ML Top 10 — ML10
Training Data Poisoning
OWASP LLM Top 10 — LLM03
Key Finding
VIA raises the attack success rate on downstream LLMs trained on synthetic data to levels comparable to those observed when directly poisoning the upstream model, succeeding where prior attacks fail due to distributional mismatch.
VIA (Virus Infection Attack)
Novel technique introduced
Synthetic data refers to artificial samples generated by models. While it has been validated to significantly enhance the performance of large language models (LLMs) during training and has been widely adopted in LLM development, potential security risks it may introduce remain uninvestigated. This paper systematically evaluates the resilience of synthetic-data-integrated training paradigm for LLMs against mainstream poisoning and backdoor attacks. We reveal that such a paradigm exhibits strong resistance to existing attacks, primarily thanks to the different distribution patterns between poisoning data and queries used to generate synthetic samples. To enhance the effectiveness of these attacks and further investigate the security risks introduced by synthetic data, we introduce a novel and universal attack framework, namely, Virus Infection Attack (VIA), which enables the propagation of current attacks through synthetic data even under purely clean queries. Inspired by the principles of virus design in cybersecurity, VIA conceals the poisoning payload within a protective "shell" and strategically searches for optimal hijacking points in benign samples to maximize the likelihood of generating malicious content. Extensive experiments on both data poisoning and backdoor attacks show that VIA significantly increases the presence of poisoning content in synthetic data and correspondingly raises the attack success rate (ASR) on downstream models to levels comparable to those observed in the poisoned upstream models.
Key Contributions
- Reveals that existing poisoning/backdoor attacks largely fail against synthetic-data-integrated LLM training due to distributional mismatch between poisoning data and clean generation queries.
- Proposes VIA (Virus Infection Attack), a universal framework that embeds poisoning payloads inside protective 'shells' within benign samples, enabling propagation through synthetic data even under purely clean queries.
- Introduces a hijacking point search strategy (HPS) to identify optimal positions in benign samples for payload injection, maximizing malicious content generation in synthetic outputs.
🛡️ Threat Analysis
VIA is fundamentally a training-data attack — it corrupts the synthetic data used to train downstream LLMs by injecting poisoning content into benign samples, causing biased or harmful model behavior.
The paper explicitly covers backdoor attacks as a primary scenario alongside data poisoning: VIA embeds triggered backdoor payloads that propagate through synthetic data, achieving high attack success rates on downstream models.