Shenzhe Zhu

attack · arXiv · Dec 9, 2025

Shenzhe Zhu · University of Toronto

Multi-agent debate framework that transforms explicit harmful queries into stealthy variants that evade LLM safety mechanisms

Prompt Injection · nlp

Papers in Database (1)