ML Security Papers

Latest papers

2 papers

attack arXiv Nov 22, 2025

Jiayi Luo, Qingyun Sun, Lingjuan Lyu et al. · Beihang University · Sony AI +1 more

Backdoor attack on Graph Foundation Models with label-free triggers and fine-tuning-resistant anchoring for persistence

Model Poisoning Transfer Learning Attack graph

1 citation PDF

defense arXiv Oct 6, 2025

Zizhao Wang, Dingcheng Li, Vaishakh Keshava et al. · Google · The University of Texas at Austin +2 more

Defends tool-using LLM agents against indirect prompt injection via adversarial RL co-training framed as a two-player zero-sum game

Prompt Injection nlp reinforcement-learning

3 citations PDF