Miao Yu

Papers in Database (2)

defense arXiv Mar 24, 2026 · 13d ago

SafeSeek: Universal Attribution of Safety Circuits in Language Models

Miao Yu, Siyuan Fu, Moayad Aloqaily et al. · University of Science and Technology of China · Squirrel AI Learning +4 more

Mechanistic interpretability framework identifying sparse safety circuits in LLMs for backdoor removal and alignment preservation

Model Poisoning Input Manipulation Attack Prompt Injection nlp
PDF
attack arXiv Aug 4, 2025 · Aug 2025

Hidden in the Noise: Unveiling Backdoors in Audio LLMs Alignment through Latent Acoustic Pattern Triggers

Liang Lin, Miao Yu, Kaiwen Luo et al. · Chinese Academy of Sciences · University of Science and Technology of China +4 more

Backdoor attack on Audio LLMs using acoustic triggers like noise and speech rate achieves >90% ASR at just 3% poisoning ratio

Model Poisoning audionlp
PDF Code