Mingjie Li

h-index: 5 · 101 citations · 7 papers (total)

Papers in Database (3)

defense · ICLR · Jan 3, 2025

SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation

Mingjie Li, Wai Man Si, Michael Backes et al. · CISPA Helmholtz Center for Information Security · Peking University

Defends LLM safety alignment against degradation during LoRA fine-tuning via a fixed safety module and task-specific adapter initialization

Transfer Learning · Attack · Prompt Injection · nlp
39 citations · 8 influential · PDF
attack · arXiv · Oct 24, 2025

Adjacent Words, Divergent Intents: Jailbreaking Large Language Models via Task Concurrency

Yukun Jiang, Mingjie Li, Michael Backes et al. · CISPA Helmholtz Center for Information Security

Jailbreaks LLMs by interleaving harmful and benign task words, hiding malicious intent from safety guardrails; achieves a 95% attack success rate

Prompt Injection · nlp
9 citations · 1 influential · PDF · Code
attack · arXiv · Feb 9, 2026

Sparse Models, Sparse Safety: Unsafe Routes in Mixture-of-Experts LLMs

Yukun Jiang, Hai Huang, Mingjie Li et al. · CISPA Helmholtz Center for Information Security

Discovers unsafe routing configurations in Mixture-of-Experts (MoE) LLMs that bypass safety alignment, achieving a 0.98 attack success rate (ASR) on AdvBench via router optimization

Prompt Injection · nlp
PDF · Code