Huizhen Shu

h-index: 1 · 2 citations · 3 papers (total)

Papers in Database (1)

defense · arXiv · Sep 24, 2025 · Sep 2025

LatentGuard: Controllable Latent Steering for Robust Refusal of Attacks and Reliable Response Generation

Huizhen Shu, Xuying Li, Zhuo Li · hydrox.ai

Defends LLMs against jailbreaks via VAE-supervised latent steering that selectively suppresses adversarial signals while preserving utility

Prompt Injection · nlp