Xuan Xie

h-index: 0 · 0 citations · 0 papers (total)

Papers in Database (1)

defense · arXiv · Feb 14, 2026

AISA: Awakening Intrinsic Safety Awareness in Large Language Models against Jailbreak Attacks

Weiming Song, Xuan Xie, Ruiping Yin · Beijing University of Technology · Macau University of Science and Technology

Defends LLMs against jailbreaks by extracting safety signals from attention heads and steering logits without fine-tuning

Prompt Injection · nlp