Shenzhe Zhu

h-index: 4 · 72 citations · 6 papers (total)

Papers in Database (1)

attack · arXiv · Dec 9, 2025 · Dec 2025

HarmTransform: Transforming Explicit Harmful Queries into Stealthy via Multi-Agent Debate

Shenzhe Zhu · University of Toronto

Multi-agent debate framework that transforms explicit harmful queries into stealthy variants that evade LLM safety mechanisms

Prompt Injection · nlp