Xuehai Tang

Papers in Database (2)

defense arXiv Apr 24, 2026

RouteGuard: Internal-Signal Detection of Skill Poisoning in LLM Agents

Wenjie Xiao, Xuehai Tang, Biyu Zhou et al. · University of Chinese Academy of Sciences · Chinese Academy of Sciences

Detects poisoned LLM agent skills by identifying attention-hijacking patterns in which malicious instructions redirect the model's reasoning

Prompt Injection Excessive Agency nlp
attack arXiv Aug 4, 2025

Hidden in the Noise: Unveiling Backdoors in Audio LLMs Alignment through Latent Acoustic Pattern Triggers

Liang Lin, Miao Yu, Kaiwen Luo et al. · Chinese Academy of Sciences · University of Science and Technology of China +4 more

Backdoor attack on Audio LLMs using acoustic triggers such as noise and speech rate achieves >90% attack success rate (ASR) at a poisoning ratio of just 3%

Model Poisoning audio nlp