ML Security Papers

Latest papers

3 papers

attack arXiv Mar 14, 2026

Zijian Ling, Pingyi Hu, Xiuyong Gao et al. · Huazhong University of Science and Technology · Tsinghua University +1 more

Inaudible near-ultrasonic acoustic channel attack that delivers jailbreak prompts to speech-driven LLMs through commodity hardware

Input Manipulation Attack Prompt Injection nlp audio multimodal

defense arXiv Oct 6, 2025

Santhosh Kumar Ravindran · Microsoft Corporation

Activation-patching framework detecting and mitigating prompt injection, deception, and bias in enterprise LLMs with 92% injection detection accuracy

Prompt Injection Excessive Agency nlp

attack arXiv Aug 8, 2025

Haorui He, Yupeng Li, Bin Benjamin Zhu et al. · Hong Kong Baptist University · The University of Hong Kong +1 more

Poisons RAG knowledge bases of LLM fact-checkers by mimicking claim decomposition and exploiting justifications to craft targeted malicious evidence

Data Poisoning Attack Prompt Injection nlp