LLM in the Middle: A Systematic Review of Threats and Mitigations to Real-World LLM-based Systems
Vitor Hugo Galhardo Moia 1, Igor Jochem Sanz 1, Gabriel Antonio Fontes Rebello 1, Rodrigo Duarte de Meneses 1, Briland Hitaj 2, Ulf Lindqvist 2
Published on arXiv
2509.10682
Data Poisoning Attack
OWASP ML Top 10 — ML02
AI Supply Chain Attacks
OWASP ML Top 10 — ML06
Prompt Injection
OWASP LLM Top 10 — LLM01
Sensitive Information Disclosure
OWASP LLM Top 10 — LLM06
Insecure Plugin Design
OWASP LLM Top 10 — LLM07
Key Finding
Provides a comprehensive threat taxonomy and defense mapping for LLM-based systems, classifying threats by severity across development and operational lifecycle phases to guide consumers and vendors in risk mitigation.
The success and wide adoption of generative AI (GenAI), particularly large language models (LLMs), have attracted the attention of cybercriminals seeking to abuse models, steal sensitive data, or disrupt services. Moreover, securing LLM-based systems is a great challenge, as both traditional threats to software applications and threats targeting LLMs and their integration must be mitigated. In this survey, we shed light on security and privacy concerns of such LLM-based systems by performing a systematic review and comprehensive categorization of threats and defensive strategies considering the entire software and LLM life cycles. We analyze real-world scenarios with distinct characteristics of LLM usage, spanning from development to operation. In addition, threats are classified according to their severity level and to which scenarios they pertain, facilitating the identification of the most relevant threats. Recommended defense strategies are systematically categorized and mapped to the corresponding life cycle phase and possible attack strategies they attenuate. This work paves the way for consumers and vendors to understand and efficiently mitigate risks during the integration of LLMs in their respective solutions or organizations. It also enables the research community to benefit from the discussion of open challenges and edge cases that may hinder the secure and privacy-preserving adoption of LLM-based systems.
Key Contributions
- Systematic categorization of threats to LLM-based systems across the full software and LLM lifecycle (development to operation), classified by severity and applicable scenario.
- Comprehensive mapping of defensive strategies to corresponding lifecycle phases and attack vectors they mitigate.
- Real-world scenario analysis covering distinct LLM usage patterns, with identification of open challenges for secure and privacy-preserving LLM integration.
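To make the shape of such a taxonomy concrete, here is a minimal sketch of how threats (with a severity and the lifecycle phases they apply to) could be linked to defenses (with the phase they act in and the threats they attenuate). This is an illustration only, not the survey's actual schema: the class names, the three-level severity scale, the severity values, and the defense labels are all hypothetical placeholders. Only the threat names and OWASP identifiers come from the tags listed above.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    DEVELOPMENT = "development"
    OPERATION = "operation"


class Severity(Enum):
    # Hypothetical three-level scale; the survey's own scale may differ.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Threat:
    name: str
    owasp_id: str
    severity: Severity      # placeholder value, not taken from the survey
    phases: frozenset       # lifecycle phases where the threat applies


@dataclass(frozen=True)
class Defense:
    name: str               # hypothetical defense label
    phase: Phase            # lifecycle phase in which the defense is applied
    mitigates: frozenset    # names of the threats this defense attenuates


def defenses_for(threat, defenses):
    """Defenses that list the threat and act in a phase where it applies."""
    return [
        d for d in defenses
        if threat.name in d.mitigates and d.phase in threat.phases
    ]


def rank_by_severity(threats):
    """Threats ordered from most to least severe."""
    return sorted(threats, key=lambda t: t.severity.value, reverse=True)


# Example entries. Severities and defenses are illustrative assumptions.
prompt_injection = Threat(
    "Prompt Injection", "LLM01", Severity.HIGH, frozenset({Phase.OPERATION})
)
data_poisoning = Threat(
    "Data Poisoning Attack", "ML02", Severity.MEDIUM,
    frozenset({Phase.DEVELOPMENT}),
)

DEFENSES = [
    Defense("input filtering", Phase.OPERATION,
            frozenset({"Prompt Injection"})),
    Defense("dataset provenance checks", Phase.DEVELOPMENT,
            frozenset({"Data Poisoning Attack"})),
]
```

Calling `defenses_for(prompt_injection, DEFENSES)` returns only the defenses that both name the threat and act in a phase where it occurs, which mirrors the idea of mapping each defense to a lifecycle phase and to the attack strategies it attenuates.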
🛡️ Threat Analysis
The survey explicitly covers data poisoning and training-time attacks on LLMs as part of its lifecycle-spanning threat taxonomy, which also maps to LLM03 (Training Data Poisoning) in the OWASP LLM Top 10.
The survey covers supply chain and integration threats across the LLM development and deployment lifecycle, including risks from third-party models, datasets, and software components.