Private-RAG: Answering Multiple Queries with LLMs while Keeping Your Data Private
Ruihan Wu, Erchi Wang, Zhiyuan Zhang, Yu-Xiang Wang
Published on arXiv: 2511.07637
Sensitive Information Disclosure
OWASP LLM Top 10 — LLM06
Key Finding
The proposed DP-RAG algorithms scale to hundreds of queries under a practical privacy budget (ε≈10) while maintaining meaningful answer utility across multiple LLMs and datasets.
MURAG / MURAG-ADA
Novel technique introduced
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by retrieving documents from an external corpus at inference time. When this corpus contains sensitive information, however, unprotected RAG systems risk leaking private information. Prior work has introduced differential privacy (DP) guarantees for RAG, but only in single-query settings, which fall short of realistic usage. In this paper, we study the more practical multi-query setting and propose two DP-RAG algorithms. The first, MURAG, leverages an individual privacy filter so that the accumulated privacy loss depends only on how frequently each document is retrieved, rather than on the total number of queries. The second, MURAG-ADA, further improves utility by privately releasing query-specific thresholds, enabling more precise selection of relevant documents. Our experiments across multiple LLMs and datasets demonstrate that the proposed methods scale to hundreds of queries within a practical DP budget ($\varepsilon\approx10$) while preserving meaningful utility.
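The core idea behind the individual privacy filter can be illustrated with a minimal sketch: each document carries its own privacy budget, is charged only when it is actually retrieved, and is excluded once its budget is exhausted, so a document's accumulated loss grows with its retrieval frequency rather than with the query count. This is a hypothetical illustration (class and method names are invented), not the paper's implementation or accounting.

```python
class IndividualPrivacyFilter:
    """Per-document privacy-budget tracking (illustrative sketch only).

    Each document starts with the same total budget. A retrieved document
    is charged a per-query cost; once its remaining budget cannot cover
    that cost, it is silently dropped from future answers.
    """

    def __init__(self, doc_ids, total_budget):
        # Remaining privacy budget for each document in the corpus.
        self.budget = {d: total_budget for d in doc_ids}

    def filter_retrieved(self, retrieved_ids, per_query_cost):
        """Keep only documents that can still pay this query's cost,
        and charge them for participating in the answer."""
        usable = []
        for d in retrieved_ids:
            if self.budget[d] >= per_query_cost:
                self.budget[d] -= per_query_cost
                usable.append(d)
        return usable
```

For example, with a total budget of 1.0 and a per-query cost of 0.6, a document can contribute to at most one query: after `filter_retrieved(["a", "b"], 0.6)`, a later call retrieving `"a"` again would exclude it while fresh documents still pass. Documents that are rarely retrieved keep their budget indefinitely, which is what lets the scheme scale to many queries.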
Key Contributions
- MURAG: a multi-query DP-RAG algorithm using an individual privacy filter so accumulated privacy loss scales with per-document retrieval frequency rather than total query count
- MURAG-ADA: extends MURAG by privately releasing query-specific retrieval thresholds to improve utility while preserving DP guarantees
- Empirical validation showing the methods scale to hundreds of queries within a practical DP budget (ε≈10) across multiple LLMs and datasets
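One simple way to realize a privately released, query-specific retrieval threshold, as the second contribution describes, is to perturb an order statistic of the similarity scores with calibrated Laplace noise. The sketch below is an assumption-laden illustration (function names, the choice of the k-th score, and the sensitivity bound are ours); MURAG-ADA's actual mechanism and accounting may differ.

```python
import math
import random


def laplace_noise(scale):
    """Sample Laplace(0, scale) via inverse-CDF transform."""
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)


def private_threshold(scores, k, epsilon, sensitivity=1.0):
    """Noisy release of the k-th highest similarity score, used as a
    query-specific cutoff for which documents to retrieve.

    Illustrative only: assumes the k-th score has the given sensitivity
    under the neighboring-corpus relation.
    """
    kth = sorted(scores, reverse=True)[k - 1]
    return kth + laplace_noise(sensitivity / epsilon)
```

A query-specific threshold lets the system retrieve only documents scoring above the noisy cutoff, which is more precise than a fixed global threshold when relevance scores vary widely across queries.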