defense 2026

Adaptive Defense Orchestration for RAG: A Sentinel-Strategist Architecture against Multi-Vector Attacks

Pranav Pallerla ¹, Wilson Naik Bhukya ², Bharath Vemula ¹, Charan Ramtej Kodi ¹

¹ University of Hyderabad

² Purdue University

0 citations

Published on arXiv

2604.20932

Membership Inference Attack

OWASP ML Top 10 — ML04

Data Poisoning Attack

OWASP ML Top 10 — ML02

Sensitive Information Disclosure

OWASP LLM Top 10 — LLM06

Key Finding

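Eliminates MBA-style membership inference leakage while recovering retrieval utility close to the undefended baseline; under data poisoning, the strongest ADO variants reduce attack success to near zero and restore contextual recall to more than 75% of the undefended baseline (a static full-defense stack loses more than 40%)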

ADO (Adaptive Defense Orchestration) / Sentinel-Strategist Architecture

Novel technique introduced

Retrieval-augmented generation (RAG) systems are increasingly deployed in sensitive domains such as healthcare and law, where they rely on private, domain-specific knowledge. This capability introduces significant security risks, including membership inference, data poisoning, and unintended content leakage. A straightforward mitigation is to enable all relevant defenses simultaneously, but doing so incurs a substantial utility cost. In our experiments, an always-on defense stack reduces contextual recall by more than 40%, indicating that retrieval degradation is the primary failure mode. To reduce this trade-off, we propose the Sentinel-Strategist architecture, a context-aware framework for risk analysis and defense selection. A Sentinel detects anomalous retrieval behavior, after which a Strategist selectively deploys only the defenses warranted by the query context. Evaluated across three benchmark datasets and five orchestration models, the resulting Adaptive Defense Orchestration (ADO) eliminates MBA-style membership inference leakage while substantially recovering retrieval utility relative to a fully static defense stack, approaching undefended baseline levels. Under data poisoning, the strongest ADO variants reduce attack success to near zero while restoring contextual recall to more than 75% of the undefended baseline, although robustness remains sensitive to model choice. Overall, these findings show that adaptive, query-aware defense can substantially reduce the security-utility trade-off in RAG systems.
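The Sentinel-then-Strategist flow described in the abstract can be sketched roughly as follows. This is a minimal rule-based stand-in, not the paper's implementation: the summary mentions five orchestration models, so the real components are presumably model-driven. The signal fields, thresholds, and defense names below are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RetrievalSignal:
    """Hypothetical per-query features; the source does not list the Sentinel's inputs."""
    probe_likeness: float        # 0..1, how much the query resembles a verbatim document probe
    near_duplicate_ratio: float  # 0..1, share of retrieved passages that are near-duplicates


@dataclass
class RiskReport:
    membership_probe: bool
    poisoning: bool


def sentinel(signal: RetrievalSignal,
             probe_threshold: float = 0.8,
             duplicate_threshold: float = 0.5) -> RiskReport:
    """Flag anomalous retrieval behavior using assumed thresholds."""
    return RiskReport(
        membership_probe=signal.probe_likeness >= probe_threshold,
        poisoning=signal.near_duplicate_ratio >= duplicate_threshold,
    )


def strategist(report: RiskReport) -> List[str]:
    """Select only the defenses the risk report calls for (names are illustrative)."""
    defenses: List[str] = []
    if report.membership_probe:
        defenses.append("membership_inference_guard")
    if report.poisoning:
        defenses.append("clustering_filter")
    return defenses


benign = RetrievalSignal(probe_likeness=0.1, near_duplicate_ratio=0.0)
suspicious = RetrievalSignal(probe_likeness=0.9, near_duplicate_ratio=0.7)
print(strategist(sentinel(benign)))      # []
print(strategist(sentinel(suspicious)))  # ['membership_inference_guard', 'clustering_filter']
```

The point of the sketch is the shape of the control flow: a benign query passes through with no defenses enabled, which is where the utility recovery over an always-on stack would come from.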

Key Contributions

Sentinel-Strategist architecture for adaptive, query-aware defense orchestration in RAG systems
First empirical evaluation of RAG under concurrent multi-vector attacks (membership inference + data poisoning + content leakage)
Demonstrates that adaptive defense restores contextual recall to more than 75% of the undefended baseline while reducing poisoning attack success to near zero, compared with a utility loss of more than 40% under static always-on defenses
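To see how the two percentages relate, here is a small worked calculation. The baseline recall of 0.80 is a made-up number chosen for illustration, and the sketch assumes both figures are relative to the undefended baseline; the summary does not state the absolute values.

```python
# Hypothetical undefended contextual recall (not a figure from the paper).
baseline_recall = 0.80

# Static always-on stack: loses more than 40% of baseline, so at most 60% remains.
static_recall_upper = baseline_recall * (1 - 0.40)

# Adaptive orchestration: restores recall to more than 75% of baseline.
ado_recall_lower = baseline_recall * 0.75

# Share of the lost recall that is won back at these boundary values.
recovered_share = (ado_recall_lower - static_recall_upper) / (
    baseline_recall - static_recall_upper
)

print(round(static_recall_upper, 2), round(ado_recall_lower, 2), round(recovered_share, 3))
```

Under these assumptions the static stack leaves recall at or below 0.48, the adaptive variant keeps it at or above 0.60, and at the boundary values about 37.5% of the lost recall is recovered.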

🛡️ Threat Analysis

Data Poisoning Attack

The paper addresses data poisoning attacks, in which adversaries inject malicious documents into the RAG knowledge base to manipulate retrieval and generation. A TrustRAG-style clustering defense filters poisoned documents, and the strongest ADO variants reduce attack success to near zero.
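One way to picture a filter of this kind is a density check over the retrieved passages' embeddings: passages crafted for the same target tend to sit unusually close together. The sketch below swaps in a simple pairwise-similarity check for the clustering step and should not be read as TrustRAG's actual procedure. The function names, the 0.95 threshold, and the minimum group size are assumptions.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_dense_group(embeddings: Sequence[Sequence[float]],
                       sim_threshold: float = 0.95,
                       min_group: int = 3) -> List[int]:
    """Return indices of passages that are NOT part of a suspiciously tight group.

    A passage is dropped when it, together with its near-identical neighbors
    (cosine >= sim_threshold), forms a group of at least min_group passages.
    """
    kept: List[int] = []
    for i, emb in enumerate(embeddings):
        neighbors = sum(
            1
            for j, other in enumerate(embeddings)
            if j != i and cosine(emb, other) >= sim_threshold
        )
        if neighbors + 1 < min_group:
            kept.append(i)
    return kept


# Toy 2-D embeddings: the first three are near-duplicates, the last two are distinct.
retrieved = [[1.0, 0.0], [0.99, 0.05], [0.98, 0.02], [0.0, 1.0], [-1.0, 0.3]]
print(filter_dense_group(retrieved))  # [3, 4]
```

A filter like this also illustrates the utility cost of leaving it always on: legitimate passages that happen to be near-duplicates would be dropped too, which is the kind of retrieval degradation the adaptive approach tries to avoid.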

Membership Inference Attack

The paper addresses membership inference attacks (MIA) on RAG knowledge bases, in which adversaries probe the system to determine whether specific documents are present. The Sentinel-Strategist architecture is evaluated against MBA-style membership inference and eliminates that leakage.
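A common way to quantify this kind of leakage is the attacker's membership advantage: true positive rate minus false positive rate over a set of probed documents. The summary does not say which metric the paper reports, so the sketch below is an assumption meant only to make "eliminating leakage" concrete. An advantage near zero means the attacker does no better than guessing.

```python
from typing import Sequence


def membership_advantage(guesses: Sequence[bool], truths: Sequence[bool]) -> float:
    """Attacker advantage = TPR - FPR.

    guesses[i] is True when the attacker claims document i is in the knowledge base;
    truths[i] is True when it actually is.
    """
    if len(guesses) != len(truths):
        raise ValueError("guesses and truths must have the same length")
    on_members = [g for g, t in zip(guesses, truths) if t]
    on_non_members = [g for g, t in zip(guesses, truths) if not t]
    if not on_members or not on_non_members:
        raise ValueError("need at least one member and one non-member")
    true_positive_rate = sum(on_members) / len(on_members)
    false_positive_rate = sum(on_non_members) / len(on_non_members)
    return true_positive_rate - false_positive_rate


truths = [True, True, False, False]
print(membership_advantage([True, True, False, False], truths))  # 1.0 (full leakage)
print(membership_advantage([True, False, True, False], truths))  # 0.0 (no better than chance)
```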

Details

Domains

nlp

Model Types

llm, transformer

Threat Tags

inference_time, training_time

Datasets

three benchmark datasets (specific names not mentioned in provided text)

Applications

retrieval-augmented generation, healthcare rag systems, legal analysis rag, enterprise knowledge management

