attack 2025

Exposing Privacy Risks in Graph Retrieval-Augmented Generation

Jiale Liu , Jiahao Zhang , Suhang Wang

0 citations

Published on arXiv

2508.17222

Sensitive Information Disclosure

OWASP LLM Top 10 — LLM06

Key Finding

Graph RAG systems are significantly more vulnerable to structured entity and relationship extraction than standard document-based RAG, despite offering reduced raw text leakage.


Retrieval-Augmented Generation (RAG) is a powerful technique for enhancing Large Language Models (LLMs) with external, up-to-date knowledge. Graph RAG has emerged as an advanced paradigm that leverages graph-based knowledge structures to provide more coherent and contextually rich answers. However, the move from plain document retrieval to structured graph traversal introduces new, under-explored privacy risks. This paper investigates the data extraction vulnerabilities of Graph RAG systems. We design and execute tailored data extraction attacks to probe their susceptibility to leaking both raw text and structured data, such as entities and their relationships. Our findings reveal a critical trade-off: while Graph RAG systems may reduce raw text leakage, they are significantly more vulnerable to the extraction of structured entity and relationship information. We also explore potential defense mechanisms to mitigate these novel attack surfaces. This work provides a foundational analysis of the unique privacy challenges in Graph RAG and offers insights for building more secure systems.


Key Contributions

  • Tailored data extraction attacks designed for Graph RAG systems that probe leakage of both raw text and structured data (entities and relationships)
  • Empirical finding of a critical trade-off: Graph RAG reduces raw text leakage compared to standard RAG but is significantly more vulnerable to structured entity/relationship extraction
  • Foundational analysis of Graph RAG-specific privacy attack surfaces alongside exploration of potential defense mechanisms
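
To make the raw-text versus structured-data distinction in the contributions above concrete, here is a minimal Python sketch of how the two kinds of leakage could be scored for a single model response. It is an illustration only: the function names, the n-gram overlap measure, the same-sentence rule for relationships, and the sample data are all assumptions made here, not the metrics or data used in the paper.

```python
import re
from typing import Iterable, Set, Tuple

# Hypothetical triple format: (head entity, relation label, tail entity).
Triple = Tuple[str, str, str]


def _norm(text: str) -> str:
    """Lowercase and collapse whitespace so matching ignores case and spacing."""
    return " ".join(text.lower().split())


def _ngrams(text: str, n: int) -> Set[Tuple[str, ...]]:
    """Set of word n-grams in the normalized text."""
    words = _norm(text).split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def text_leakage(source: str, output: str, n: int = 5) -> float:
    """Share of the source's word n-grams that reappear verbatim in the output."""
    src = _ngrams(source, n)
    if not src:
        return 0.0
    return len(src & _ngrams(output, n)) / len(src)


def entity_leakage(private_entities: Iterable[str], output: str) -> float:
    """Share of private entity names that appear anywhere in the output."""
    entities = {_norm(e) for e in private_entities}
    if not entities:
        return 0.0
    out = _norm(output)
    return sum(e in out for e in entities) / len(entities)


def relation_leakage(private_triples: Iterable[Triple], output: str) -> float:
    """Share of triples whose head and tail co-occur in one output sentence.

    Assumption: co-occurrence in a sentence is treated as a leaked link;
    the relation label itself is not checked.
    """
    triples = list(private_triples)
    if not triples:
        return 0.0
    sentences = [_norm(s) for s in re.split(r"[.!?\n]+", output)]
    leaked = sum(
        any(_norm(h) in s and _norm(t) in s for s in sentences)
        for h, _, t in triples
    )
    return leaked / len(triples)


# Made-up example data, for illustration only.
source = "Alice Chen was treated for asthma at Mercy Hospital in March."
output = "Alice Chen is linked to Mercy Hospital. She has asthma."
entities = ["Alice Chen", "Mercy Hospital", "asthma"]
triples = [
    ("Alice Chen", "treated_at", "Mercy Hospital"),
    ("Alice Chen", "diagnosed_with", "asthma"),
]

print(text_leakage(source, output))       # 0.0
print(entity_leakage(entities, output))   # 1.0
print(relation_leakage(triples, output))  # 0.5
```

In this toy response no five-word span of the source is reproduced, yet every entity and half of the relationships are recoverable, which mirrors the kind of trade-off the contributions describe.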

🛡️ Threat Analysis


Details

Domains
nlp, graph
Model Types
llm, gnn
Threat Tags
black_box, inference_time, targeted
Applications
graph rag systems, knowledge graph-augmented llms, question answering systems