Xinhe Wang

attack arXiv Apr 27, 2026 · 24d ago

Xinhe Wang, Katia Sycara, Yaqi Xie · Carnegie Mellon University

Multi-turn jailbreaking attack that deceives LLM safety by simulating benign intent across conversations to elicit harmful outputs

Prompt Injection nlpmultimodal

Papers in Database (1)