Tat-Seng Chua

h-index: 2 · 6 citations · 8 papers (total)

Papers in Database (2)

defense arXiv Jan 29, 2026 · 9w ago

TraceRouter: Robust Safety for Large Foundation Models via Path-Level Intervention

Chuancheng Shi, Shangze Li, Wenjun Lu et al. · The University of Sydney · Nanjing University of Science and Technology +2 more

Defends LLMs, diffusion models, and MLLMs from jailbreaks by tracing and severing harmful semantic circuits via sparse autoencoders and causal path analysis

Input Manipulation Attack · Prompt Injection · nlp · vision · multimodal · generative

PDF

defense arXiv Feb 1, 2026 · 9w ago

Who Transfers Safety? Identifying and Targeting Cross-Lingual Shared Safety Neurons

Xianhui Zhang, Chengyu Xie, Linxia Zhu et al. · Nanjing University of Science and Technology · National University of Singapore +2 more

Identifies sparse cross-lingual safety neurons in LLMs and proposes targeted fine-tuning to close multilingual jailbreak safety gaps

Prompt Injection · nlp

PDF Code