defense 2026

Memory poisoning and secure multi-agent systems

Vicenç Torra 1, Maria Bras-Amorós 1,2

0 citations

α

Published on arXiv

2603.20357

Data Poisoning Attack

OWASP ML Top 10 — ML02

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

Proposes cryptographic solutions for securing agent memory systems against poisoning attacks, with proof-of-concept implementation for local inference

Private Knowledge Retrieval for Semantic Memory

Novel technique introduced


Memory poisoning attacks for Agentic AI and multi-agent systems (MAS) have recently caught attention. It is partially due to the fact that Large Language Models (LLMs) facilitate the construction and deployment of agents. Different memory systems are being used nowadays in this context, including semantic, episodic, and short-term memory. This distinction between the different types of memory systems focuses mostly on their duration but also on their origin and their localization. It ranges from the short-term memory originated at the user's end localized in the different agents to the long-term consolidated memory localized in well established knowledge databases. In this paper, we first present the main types of memory systems, we then discuss the feasibility of memory poisoning attacks in these different types of memory systems, and we propose mitigation strategies. We review the already existing security solutions to mitigate some of the alleged attacks, and we discuss adapted solutions based on cryptography. We propose to implement local inference based on private knowledge retrieval as an example of mitigation strategy for memory poisoning for semantic memory. We also emphasize actual risks in relation to interactions between agents, which can cause memory poisoning. These latter risks are not so much studied in the literature and are difficult to formalize and solve. Thus, we contribute to the construction of agents that are secure by design.


Key Contributions

  • Taxonomy of memory poisoning attacks across semantic, episodic, and short-term memory systems in LLM-based agents
  • Cryptographic mitigation strategies including private knowledge retrieval for semantic memory protection
  • Analysis of inter-agent interaction risks that cause memory poisoning in multi-agent systems

🛡️ Threat Analysis

Data Poisoning Attack

Memory poisoning is a form of data poisoning where malicious agents corrupt the memory/knowledge bases that LLM-based agents use for decision-making. The paper discusses attacks where adversaries inject malicious information into agent memory systems (semantic, episodic, short-term) to corrupt behavior.


Details

Domains
nlp
Model Types
llm
Threat Tags
training_timeinference_time
Applications
multi-agent systemsagentic aillm-based agents