Analyzing Code Injection Attacks on LLM-based Multi-Agent Systems in Software Development
Brian Bowers 1, Smita Khapre 1, Jugal Kalita 2
Published on arXiv
2512.21818
Prompt Injection
OWASP LLM Top 10 — LLM01
Excessive Agency
OWASP LLM Top 10 — LLM08
Key Finding
Embedding poisonous few-shot examples in injected code increases attack success rate against the LLM security analysis agent from 0% to 71.95%
Poisonous Few-Shot Code Injection
Novel technique introduced
Agentic AI and Multi-Agent Systems are poised to dominate industry and society imminently. Powered by goal-driven autonomy, they represent a powerful form of generative AI, marking a transition from reactive content generation into proactive multitasking capabilities. As an exemplar, we propose an architecture of a multi-agent system for the implementation phase of the software engineering process. We also present a comprehensive threat model for the proposed system. We demonstrate that while such systems can generate code quite accurately, they are vulnerable to attacks, including code injection. Due to their autonomous design and lack of humans in the loop, these systems cannot identify and respond to attacks by themselves. This paper analyzes the vulnerability of multi-agent systems and concludes that the coder-reviewer-tester architecture is more resilient than both the coder and coder-tester architectures, but is less efficient at writing code. We find that by adding a security analysis agent, we mitigate the loss in efficiency while achieving even better resiliency. We conclude by demonstrating that the security analysis agent is vulnerable to advanced code injection attacks, showing that embedding poisonous few-shot examples in the injected code can increase the attack success rate from 0% to 71.95%.
Key Contributions
- Proposes and evaluates coder, coder-tester, and coder-reviewer-tester MAS architectures for SDLC implementation phase against code injection attacks
- Introduces a security analysis agent that improves resilience while recovering efficiency lost in the coder-reviewer-tester architecture
- Demonstrates that embedding poisonous few-shot examples in injected code bypasses the security analysis agent, raising attack success rate from 0% to 71.95%