
ER-MIA: Black-Box Adversarial Memory Injection Attacks on Long-Term Memory-Augmented Large Language Models

Mitchell Piehl, Zhaohan Xi, Zuobin Xiong, Pan He, Muchao Ye

0 citations · 39 references


Published on arXiv · 2602.15344

Input Manipulation Attack (OWASP ML Top 10, ML01)

Prompt Injection (OWASP LLM Top 10, LLM01)

Key Finding

Similarity-based retrieval in long-term memory-augmented LLMs constitutes a universal, system-level vulnerability exploitable under black-box access, with adversarially injected memories severely impairing multi-hop and temporal reasoning across diverse LLM and memory system configurations.

ER-MIA

Novel technique introduced


Large language models (LLMs) are increasingly augmented with long-term memory systems to overcome finite context windows and enable persistent reasoning across interactions. However, recent research shows that this memory augmentation introduces additional attack surfaces, making memory-augmented LLMs more vulnerable. In this paper, we present the first systematic study of black-box adversarial memory injection attacks that target the similarity-based retrieval mechanism in long-term memory-augmented LLMs. We introduce ER-MIA, a unified framework that exposes this vulnerability and formalizes two realistic attack settings: content-based attacks and question-targeted attacks. For these settings, ER-MIA provides an arsenal of composable attack primitives and ensemble attacks that achieve high success rates under minimal attacker assumptions. Extensive experiments across multiple LLMs and long-term memory systems demonstrate that similarity-based retrieval constitutes a fundamental, system-level vulnerability, revealing security risks that persist across memory designs and application scenarios.
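To see why similarity-based retrieval is exploitable under black-box access, consider a toy sketch of the question-targeted setting described above. The embedding, memory texts, and attack string below are all hypothetical illustrations (real systems use dense sentence encoders, not bag-of-words vectors): an attacker who can anticipate a future query simply echoes that question inside an injected memory, dragging the poisoned entry toward the query in embedding space so it outranks the legitimate memory at retrieval time.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words embedding; a stand-in for the dense sentence
    # encoders real memory systems use, but cosine ranking over
    # either exhibits the same vulnerability.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(store, query, k=1):
    # Similarity-based retrieval: return the k memories closest
    # to the query in embedding space.
    q = embed(query)
    return sorted(store, key=lambda m: cosine(embed(m), q), reverse=True)[:k]

store = [
    "Alice's flight to Boston departs Friday at 9am.",
    "The quarterly report is due at the end of the month.",
]
# Question-targeted injection (hypothetical sketch): the attacker
# echoes the anticipated question verbatim, so the poisoned memory
# lands closer to the query than the legitimate one.
store.append("When does Alice's flight to Boston depart on Friday? "
             "It departs at 6pm.")

query = "When does Alice's flight to Boston depart on Friday?"
top = retrieve(store, query, k=1)[0]  # the poisoned memory wins
```

The same mechanism needs no knowledge of model parameters or retriever internals, which is what makes the attack black-box: the attacker only needs write access to the memory store and a guess at the target question.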


Key Contributions

  • First systematic study of black-box adversarial memory injection attacks (AMIAs) against similarity-based retrieval in dynamic long-term memory-augmented LLMs, under realistic minimal-attacker-knowledge assumptions
  • ER-MIA framework formalizing two attack settings (content-based and question-targeted) with composable attack primitives and ensemble attacks requiring no access to model parameters or retrieval system internals
  • Empirical demonstration that similarity-based retrieval is a fundamental, system-level vulnerability that persists across multiple LLM architectures and long-term memory system designs

🛡️ Threat Analysis

Input Manipulation Attack

ER-MIA crafts adversarial content specifically engineered to be embedding-close to legitimate memories, exploiting the similarity-based retrieval mechanism. This is adversarial content manipulation targeting an LLM-integrated retrieval system (analogous to adversarial RAG poisoning), which the OWASP guidelines explicitly classify under ML01.
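The "embedding-close" property can be illustrated with the content-based setting: a minimal, hypothetical sketch (the memory text, the flipped fact, and the bag-of-words embedding are all illustrative assumptions, not the paper's actual primitives) in which the attacker clones a legitimate memory and alters a single fact. Because almost every token is preserved, the poisoned entry remains nearly identical to the original in embedding space, so any query that would retrieve the legitimate memory is likely to retrieve the poisoned one as well.

```python
import math
import re

def embed(text):
    # Toy bag-of-words vector (illustrative stand-in for a dense
    # sentence embedding).
    vec = {}
    for t in re.findall(r"\w+", text.lower()):
        vec[t] = vec.get(t, 0) + 1
    return vec

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

legit = "The project deadline was moved to next Tuesday at noon."
# Content-based primitive (hypothetical sketch): copy the legitimate
# memory and flip one fact, leaving the embedding almost unchanged.
poison = legit.replace("Tuesday", "Thursday")

sim = cosine(embed(legit), embed(poison))  # high similarity despite the flipped fact
```

A defender filtering on embedding distance alone cannot distinguish the poisoned entry from a benign paraphrase, which is one way to read the paper's claim that the vulnerability is system-level rather than model-specific.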


Details

Domains
nlp
Model Types
llm
Threat Tags
black_box · inference_time · targeted
Applications
memory-augmented llms · persistent ai assistants · long-horizon llm agents · retrieval-augmented generation