survey 2026

Towards Secure Retrieval-Augmented Generation: A Comprehensive Review of Threats, Defenses and Benchmarks

0 citations

Published on arXiv

2603.21654

Prompt Injection

OWASP LLM Top 10 — LLM01

Training Data Poisoning

OWASP LLM Top 10 — LLM03

Sensitive Information Disclosure

OWASP LLM Top 10 — LLM06

Key Finding

Provides unified analysis of RAG pipeline security threats and consolidates authoritative test datasets, security standards, and evaluation frameworks for future RAG security research

Retrieval-Augmented Generation (RAG) significantly mitigates the hallucinations and domain knowledge deficiency in large language models by incorporating external knowledge bases. However, the multi-module architecture of RAG introduces complex system-level security vulnerabilities. Guided by the RAG workflow, this paper analyzes the underlying vulnerability mechanisms and systematically categorizes core threat vectors such as data poisoning, adversarial attacks, and membership inference attacks. Based on this threat assessment, we construct a taxonomy of RAG defense technologies from a dual perspective encompassing both input and output stages. The input-side analysis reviews data protection mechanisms including dynamic access control, homomorphic encryption retrieval, and adversarial pre-filtering. The output-side examination summarizes advanced leakage prevention techniques such as federated learning isolation, differential privacy perturbation, and lightweight data sanitization. To establish a unified benchmark for future experimental design, we consolidate authoritative test datasets, security standards, and evaluation frameworks. To the best of our knowledge, this paper presents the first end-to-end survey dedicated to the security of RAG systems. Distinct from existing literature that isolates specific vulnerabilities, we systematically map the entire pipeline-providing a unified analysis of threat models, defense mechanisms, and evaluation benchmarks. By enabling deep insights into potential risks, this work seeks to foster the development of highly robust and trustworthy next-generation RAG systems.

Key Contributions

First comprehensive end-to-end survey of RAG system security covering threats, defenses, and benchmarks
Systematic taxonomy of RAG-specific vulnerabilities including data poisoning, adversarial attacks, and membership inference mapped to RAG workflow stages
Dual-perspective defense framework analyzing input-side protections (access control, encryption, pre-filtering) and output-side techniques (federated learning, differential privacy, sanitization)

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm

Threat Tags

training_timeinference_time

Applications

retrieval-augmented generation systemsllm-based question answering

Read PDF arXiv

Towards Secure Retrieval-Augmented Generation: A Comprehensive Review of Threats, Defenses and Benchmarks

Key Contributions

🛡️ Threat Analysis

Details

Similar Papers

A Survey on Data Security in Large Language Models

SoK: Privacy Risks and Mitigations in Retrieval-Augmented Generation Systems

IH-Challenge: A Training Dataset to Improve Instruction Hierarchy on Frontier LLMs

Chain-of-Authorization: Internalizing Authorization into Large Language Models via Reasoning Trajectories

SD-RAG: A Prompt-Injection-Resilient Framework for Selective Disclosure in Retrieval-Augmented Generation

Machine Unlearning Meets Adversarial Robustness via Constrained Interventions on LLMs

Amplification Effects in Test-Time Reinforcement Learning: Safety and Reasoning Vulnerabilities

Dual-Space Smoothness for Robust and Balanced LLM Unlearning