Handing Wang

Papers in Database (3)

defense arXiv Feb 26, 2026 · 5w ago

Multilingual Safety Alignment Via Sparse Weight Editing

Jiaming Liang, Zhaoxin Wang, Handing Wang · Xidian University

Training-free sparse weight editing transfers LLM safety alignment from high-resource to low-resource languages to block cross-lingual jailbreaks

Prompt Injection nlp
PDF
defense arXiv Mar 23, 2026 · 14d ago

DTVI: Dual-Stage Textual and Visual Intervention for Safe Text-to-Image Generation

Binhong Tan, Zhaoxin Wang, Handing Wang · Xidian University

Dual-stage defense blocking unsafe image generation via sequence-level prompt intervention and visual-stage filtering across multiple harmful categories

Input Manipulation Attack Prompt Injection visionnlpmultimodalgenerative
PDF
defense arXiv Feb 12, 2026 · 7w ago

SafeNeuron: Neuron-Level Safety Alignment for Large Language Models

Zhaoxin Wang, Jiaming Liang, Fengbin Zhu et al. · Xidian University · National University of Singapore +1 more

Defends LLM safety alignment against neuron pruning attacks by redistributing safety representations across the network via selective neuron freezing

Prompt Injection nlpmultimodal
PDF