ML Security Papers

Latest papers

3 papers

defense arXiv Apr 18, 2026 · 4w ago

Jiayuan Liu, Shiyi Du, Weihua Du et al. · Carnegie Mellon University · Foundations of Cooperative AI Lab +1 more

Token-level collaborative generation defends multi-agent LLM systems against prompt injection attacks that corrupt a majority of agents

Prompt Injection nlp

attack arXiv Mar 16, 2026 · 9w ago

Zhenlin Xu, Xiaogang Zhu, Yu Yao et al. · Adelaide University · The University of Sydney +1 more

Memory-poisoning attack on LLM agents that hijacks tool-selection control flow across tasks via malicious memory retrieval

Prompt Injection Excessive Agency nlp

defense arXiv Mar 13, 2026 · 9w ago

Zhifang Zhang, Bojun Yang, Shuo He et al. · Southeast University · Nanyang Technological University +2 more

Test-time backdoor defense for LVLMs that detects poisoned inputs via cross-modal attention anomalies and purifies them by pruning trigger tokens

Model Poisoning multimodal nlp vision