defense arXiv Feb 24, 2026
Inderjeet Singh, Vikas Pahuja, Aishvariya Priya Rathina Sabapathy et al. · Fujitsu Research of Europe · Fujitsu Limited
Stateful POMDP-based defense detects distributed multi-stage prompt injections in multimodal agentic RAG via LLM belief-state tracking
Input Manipulation Attack Prompt Injection multimodal nlp
Current stateless defences for multimodal agentic RAG fail to detect adversarial strategies that distribute malicious semantics across retrieval, planning, and generation components. We formulate this security challenge as a Partially Observable Markov Decision Process (POMDP), where adversarial intent is a latent variable inferred from noisy multi-stage observations. We introduce MMA-RAG^T, an inference-time control framework governed by a Modular Trust Agent (MTA) that maintains an approximate belief state via structured LLM reasoning. Operating as a model-agnostic overlay, MMA-RAG^T mediates a configurable set of internal checkpoints to enforce stateful defence-in-depth. Extensive evaluation on 43,774 instances demonstrates a 6.50x average reduction factor in Attack Success Rate relative to undefended baselines, with negligible utility cost. Crucially, a factorial ablation validates our theoretical bounds: while statefulness and spatial coverage are individually necessary (26.4 pp and 13.6 pp gains respectively), stateless multi-point intervention can yield zero marginal benefit under homogeneous stateless filtering when checkpoint detections are perfectly correlated.
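To make the stateful-versus-stateless distinction concrete, the following is a minimal sketch of belief tracking over a binary latent intent variable. It is not the paper's method: MMA-RAG^T approximates the belief state through structured LLM reasoning, whereas this sketch substitutes an exact Bayes update with hand-picked likelihoods and omits POMDP actions and transition dynamics. All names (`Observation`, `update_belief`, `stateful_flags`, `stateless_flags`), the prior of 0.05, the threshold of 0.9, and the likelihood values are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """One checkpoint's evidence (hypothetical likelihoods, not from the paper)."""
    checkpoint: str
    p_given_adversarial: float  # P(observation | adversarial intent)
    p_given_benign: float       # P(observation | benign intent)


def update_belief(belief: float, obs: Observation) -> float:
    """Bayes update of P(adversarial) after one observation."""
    numerator = belief * obs.p_given_adversarial
    denominator = numerator + (1.0 - belief) * obs.p_given_benign
    return numerator / denominator


def stateful_flags(observations, prior: float, threshold: float) -> bool:
    """Carry the belief across checkpoints; flag once it crosses the threshold."""
    belief = prior
    for obs in observations:
        belief = update_belief(belief, obs)
        if belief >= threshold:
            return True
    return False


def stateless_flags(observations, prior: float, threshold: float) -> bool:
    """Judge every checkpoint in isolation, always restarting from the prior."""
    return any(update_belief(prior, obs) >= threshold for obs in observations)


# A distributed attack: each stage is only mildly suspicious on its own.
TRACE = [
    Observation("retrieval", 0.6, 0.1),
    Observation("planning", 0.6, 0.1),
    Observation("generation", 0.6, 0.1),
]

if __name__ == "__main__":
    print("stateful :", stateful_flags(TRACE, prior=0.05, threshold=0.9))
    print("stateless:", stateless_flags(TRACE, prior=0.05, threshold=0.9))
```

Under these assumed numbers, no single checkpoint raises the belief above the threshold when evaluated from the prior, so the stateless check never fires, while the accumulated belief crosses the threshold at the third checkpoint. This illustrates why evidence spread across retrieval, planning, and generation can evade per-stage filters yet still be caught by a defence that carries state between checkpoints.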
llm vlm transformer