Xuying Li

attack arXiv Oct 29, 2025 · Oct 2025

André V. Duarte, Xuying Li, Bin Zeng et al. · Carnegie Mellon University · Instituto Superior Técnico +1 more

Agentic feedback-loop pipeline extracts memorized copyrighted books from LLMs, improving ROUGE-L by 24% over single-pass extraction

Model Inversion Attack · Sensitive Information Disclosure · nlp

defense arXiv Sep 24, 2025 · Sep 2025

Huizhen Shu, Xuying Li, Zhuo Li · hydrox.ai

Defends LLMs against jailbreaks via VAE-supervised latent steering that selectively suppresses adversarial signals while preserving utility

Prompt Injection · nlp

Papers in Database (2)