Adversarial Bug Reports as a Security Risk in Language Model-Based Automated Program Repair
Piotr Przymus 1, Andreas Happe 2, Jürgen Cito 2
Published on arXiv
2509.05372
Prompt Injection
OWASP LLM Top 10 — LLM01
Key Finding
90% of adversarial bug reports triggered attacker-aligned patches; best pre-repair filter (LlamaGuard variant) blocked only 47%, and post-repair static analysis was effective in just 58% of cases
Large Language Model (LLM) - based Automated Program Repair (APR) systems are increasingly integrated into modern software development workflows, offering automated patches in response to natural language bug reports. However, this reliance on untrusted user input introduces a novel and underexplored attack surface. In this paper, we investigate the security risks posed by adversarial bug reports -- realistic-looking issue submissions crafted to mislead APR systems into producing insecure or harmful code changes. We develop a comprehensive threat model and conduct an empirical study to evaluate the vulnerability of APR systems to such attacks. Our demonstration comprises 51 adversarial bug reports generated across a spectrum of strategies, ranging from manual curation to fully automated pipelines. We test these against a leading LLM-based APR system and assess both pre-repair defenses (e.g., LlamaGuard variants, PromptGuard variants, Granite-Guardian, and custom LLM filters) and post-repair detectors (GitHub Copilot, CodeQL). Our findings show that current defenses are insufficient: 90% of crafted bug reports triggered attacker-aligned patches. The best pre-repair filter blocked only 47%, while post-repair analysis -- often requiring human oversight -- was effective in just 58% of cases. To support scalable security testing, we introduce a prototype framework for automating the generation of adversarial bug reports. Our analysis exposes a structural asymmetry: generating adversarial inputs is inexpensive, while detecting or mitigating them remains costly and error-prone. We conclude with recommendations for improving the robustness of APR systems against adversarial misuse and highlight directions for future work on secure APR.
Key Contributions
- Formal threat model for adversarial exploitation of bug-reporting interfaces in LLM-based APR workflows, covering vulnerability injection, data exfiltration, and CI denial-of-service
- Empirical study with 51 adversarial bug reports (manual to fully automated) showing 90% attacker-aligned patch generation against a leading APR system, with best pre-repair filter blocking only 47%
- Open-source prototype framework for automating adversarial bug report generation to support scalable APR security testing