Marianne Menglin Liu

h-index: 0 · 0 citations · 5 papers (total)

Papers in Database (1)

benchmark · arXiv · Oct 6, 2025

RAG Makes Guardrails Unsafe? Investigating Robustness of Guardrails under RAG-style Contexts

Yining She, Daniel W. Peterson, Marianne Menglin Liu et al. · Carnegie Mellon University · Oracle Cloud Infrastructure +1 more

Benign RAG-retrieved documents flip LLM safety guardrail judgments about 11% of the time, exposing a context-robustness gap that attackers could exploit.

Prompt Injection nlp