Provably Secure Retrieval-Augmented Generation
Pengcheng Zhou, Yinglun Feng, Zhongliang Yang
Published on arXiv
2508.01084
Data Poisoning Attack
OWASP ML Top 10 — ML02
Sensitive Information Disclosure
OWASP LLM Top 10 — LLM06
Key Finding
The SAG framework provides formally proven confidentiality and integrity guarantees for RAG systems, effectively resisting data leakage and poisoning attacks across multiple benchmark datasets while maintaining retrieval and generation performance.
SAG (Secure Augmented Generation)
Novel technique introduced
Although Retrieval-Augmented Generation (RAG) systems have been widely applied, the privacy and security risks they face, such as data leakage and data poisoning, have yet to be systematically addressed. Existing defense strategies primarily rely on heuristic filtering or enhancing retriever robustness, and they suffer from limited interpretability, a lack of formal security guarantees, and vulnerability to adaptive attacks. To address these challenges, this paper proposes the first provably secure framework for RAG systems (SAG). Our framework employs a pre-storage full-encryption scheme to ensure dual protection of both retrieved content and vector embeddings, guaranteeing that only authorized entities can access the data. Through formal security proofs, we rigorously verify the scheme's confidentiality and integrity under a computational security model. Extensive experiments across multiple benchmark datasets demonstrate that our framework effectively resists a range of state-of-the-art attacks. This work establishes a theoretical foundation and practical paradigm for verifiably secure RAG systems, advancing AI-powered services toward formally guaranteed security.
Key Contributions
- First provably secure RAG framework (SAG) with formal confidentiality and integrity proofs under a computational security model
- Pre-storage full-encryption scheme protecting both retrieved text content and vector embeddings, ensuring only authorized entities can access or modify the knowledge base
- Empirical validation across diverse benchmark datasets showing effective resistance to state-of-the-art RAG attacks while preserving retrieval precision and generation quality
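The paper does not reproduce its concrete construction here, but the pre-storage full-encryption idea can be sketched as an encrypt-then-MAC step applied to each knowledge-base record (passage text plus its embedding) before indexing. The following is a minimal illustration only, assuming a SHA-256 counter-mode keystream and HMAC tag; the key names, record layout, and cipher choice are assumptions, not the authors' scheme.

```python
import hashlib
import hmac
import json
import os
import struct

def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Derive n bytes of keystream from SHA-256 in counter mode (sketch only)."""
    out = b""
    for ctr in range((n + 31) // 32):
        out += hashlib.sha256(key + nonce + struct.pack(">I", ctr)).digest()
    return out[:n]

def seal(enc_key: bytes, mac_key: bytes, text: str, embedding: list) -> dict:
    """Encrypt both the passage text and its vector embedding, then tag
    the ciphertext so tampering is detectable (encrypt-then-MAC)."""
    plaintext = json.dumps({"text": text, "embedding": embedding}).encode()
    nonce = os.urandom(16)
    ct = bytes(a ^ b for a, b in
               zip(plaintext, _keystream(enc_key, nonce, len(plaintext))))
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return {"nonce": nonce, "ct": ct, "tag": tag}

def open_record(enc_key: bytes, mac_key: bytes, rec: dict) -> dict:
    """Verify integrity first, then decrypt; raises if the record was modified."""
    expected = hmac.new(mac_key, rec["nonce"] + rec["ct"], hashlib.sha256).digest()
    if not hmac.compare_digest(expected, rec["tag"]):
        raise ValueError("integrity check failed: record rejected")
    pt = bytes(a ^ b for a, b in
               zip(rec["ct"], _keystream(enc_key, rec["nonce"], len(rec["ct"]))))
    return json.loads(pt)
```

Because verification precedes decryption, an unauthorized party can neither read the stored content nor substitute a poisoned record without failing the tag check, which is the dual confidentiality/integrity property the framework formalizes.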
🛡️ Threat Analysis
The paper explicitly defends against data poisoning of the RAG knowledge base — malicious corpus providers injecting false or manipulative content — through cryptographic integrity verification backed by formal security proofs.
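The poisoning defense reduces to a simple property: every authorized corpus entry carries an authentication tag, so injected or modified content fails verification at retrieval time. A hypothetical illustration with HMAC-SHA256 (the key and passages are made up for the example, not taken from the paper):

```python
import hashlib
import hmac

# Hypothetical secret key; in practice this would be securely provisioned
# and held only by authorized entities.
KEY = b"\x00" * 32

def tag(passage: bytes) -> bytes:
    """Authentication tag computed when an authorized provider stores a passage."""
    return hmac.new(KEY, passage, hashlib.sha256).digest()

def verify(passage: bytes, t: bytes) -> bool:
    """Constant-time check performed before a passage is handed to the generator."""
    return hmac.compare_digest(tag(passage), t)

# An authorized entry carries a valid tag; an attacker who swaps in
# poisoned content cannot forge a matching tag without the key.
trusted = b"Aspirin is a nonsteroidal anti-inflammatory drug."
stored_tag = tag(trusted)
poisoned = b"Aspirin is safe at any dose."  # attacker-injected substitution
```

Without the key, a poisoning adversary cannot produce a valid tag for new content, which is what lets the framework state its integrity guarantee as a formal (computational) security claim rather than a heuristic filter.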