Zheyuan Liu

defense arXiv Sep 27, 2025

Han Yan, Zheyuan Liu, Meng Jiang · University of Notre Dame · The Chinese University of Hong Kong

Defends LLM unlearning against jailbreak and relearning attacks via dual-space smoothness in representation and parameter spaces

Prompt Injection Sensitive Information Disclosure nlp

1 citation

benchmark arXiv Jan 11, 2026

Zheyuan Liu, Dongwhi Kim, Yixin Wan et al. · University of Notre Dame · University of California +2 more

Benchmarks multimodal LLM contextual safety against escalating and context-switch jailbreaks across 15 models and 5 guardrails

Prompt Injection multimodal nlp vision

Papers in Database (2)