ER-MIA: Black-Box Adversarial Memory Injection Attacks on Long-Term Memory-Augmented Large Language Models
Mitchell Piehl, Zhaohan Xi, Zuobin Xiong, Pan He, Muchao Ye
Published on arXiv: 2602.15344
Input Manipulation Attack
OWASP ML Top 10 — ML01
Prompt Injection
OWASP LLM Top 10 — LLM01
Key Finding
Similarity-based retrieval in long-term memory-augmented LLMs constitutes a universal, system-level vulnerability exploitable under black-box access, with adversarially injected memories severely impairing multi-hop and temporal reasoning across diverse LLM and memory system configurations.
ER-MIA
Novel technique introduced
Large language models (LLMs) are increasingly augmented with long-term memory systems to overcome finite context windows and enable persistent reasoning across interactions. However, recent research shows that this memory introduces an additional attack surface, making memory-augmented LLMs more vulnerable than their stateless counterparts. In this paper, we present the first systematic study of black-box adversarial memory injection attacks that target the similarity-based retrieval mechanism in long-term memory-augmented LLMs. We introduce ER-MIA, a unified framework that exposes this vulnerability and formalizes two realistic attack settings: content-based attacks and question-targeted attacks. For each setting, ER-MIA provides composable attack primitives and ensemble attacks that achieve high success rates under minimal attacker assumptions. Extensive experiments across multiple LLMs and long-term memory systems demonstrate that similarity-based retrieval constitutes a fundamental, system-level vulnerability, revealing security risks that persist across memory designs and application scenarios.
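To make the targeted mechanism concrete, the sketch below shows a minimal similarity-based long-term memory store of the kind the abstract describes: entries are embedded, and a query retrieves the top-k most similar ones. The `embed`/`cosine` helpers are hypothetical toy stand-ins (bag-of-words instead of the learned dense encoders real systems use), not the paper's implementation.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words embedding; real memory systems use learned dense encoders.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class MemoryStore:
    """Minimal long-term memory: store entries, retrieve top-k by similarity."""
    def __init__(self):
        self.entries = []

    def add(self, text):
        self.entries.append((text, embed(text)))

    def retrieve(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

store = MemoryStore()
store.add("Alice moved to Paris in 2021")
store.add("Bob likes tennis")
print(store.retrieve("Where does Alice live?", k=1))
```

Because ranking depends only on embedding proximity, anything an attacker can write into the store competes on equal footing with legitimate memories, which is the surface ER-MIA exploits.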
Key Contributions
- First systematic study of black-box adversarial memory injection attacks (AMIAs) against similarity-based retrieval in dynamic long-term memory-augmented LLMs, under realistic minimal-attacker-knowledge assumptions
- ER-MIA framework formalizing two attack settings (content-based and question-targeted) with composable attack primitives and ensemble attacks requiring no access to model parameters or retrieval system internals
- Empirical demonstration that similarity-based retrieval is a fundamental, system-level vulnerability that persists across multiple LLM architectures and long-term memory system designs
🛡️ Threat Analysis
ER-MIA crafts adversarial content specifically engineered to be embedding-close to legitimate memories, exploiting the similarity-based retrieval mechanism. This is adversarial content manipulation targeting an LLM-integrated retrieval system (analogous to adversarial RAG poisoning), which the OWASP guidelines explicitly classify under ML01.
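The question-targeted setting can be illustrated end to end with a toy example: an injected memory that echoes an anticipated user question becomes embedding-close to that question and outranks the legitimate memory at retrieval time. The `embed`/`cosine` helpers and the specific strings are hypothetical illustrations, not the paper's attack primitives.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words embedding; real systems use learned dense encoders.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

memories = [
    "Alice moved to Paris in 2021",                # legitimate memory
    "Where does Alice live? Alice lives in Oslo",  # injected: echoes the question
]

question = "Where does Alice live?"
q = embed(question)

# Rank memories by similarity to the anticipated question; the injected
# entry wins because it was crafted to be embedding-close to the query.
ranked = sorted(memories, key=lambda m: cosine(q, embed(m)), reverse=True)
print(ranked[0])
```

Note that the attacker needs no access to model parameters or retrieval internals: black-box write access to the memory plus a guess at the query distribution is enough, which is why the paper frames this as a system-level rather than model-specific vulnerability.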