Bingjie Zhang

h-index: 2 · 12 citations · 3 papers (total)

Papers in Database (1)

defense · arXiv · Oct 16, 2025 · Oct 2025

A Guardrail for Safety Preservation: When Safety-Sensitive Subspace Meets Harmful-Resistant Null-Space

Bingjie Zhang, Yibo Yang, Zhe Ren et al. · Jilin University · King Abdullah University of Science and Technology +1 more

Preserves LLM safety alignment during fine-tuning by freezing safety-relevant weight subspaces and projecting adapter updates into a harmful-resistant null space.

Transfer Learning Attack · Prompt Injection · nlp
3 citations