Latest papers

4 papers
attack · arXiv · Dec 2, 2025

LeechHijack: Covert Computational Resource Exploitation in Intelligent Agent Systems

Yuanhe Zhang, Weiliu Wang, Zhenhong Zhou et al. · Beijing University of Posts and Telecommunications · Hangzhou Dianzi University +4 more

LeechHijack backdoors MCP tools to covertly parasitize LLM agent compute through a runtime command-and-control (C2) channel, achieving a 77% success rate while remaining undetected

Insecure Plugin Design · nlp
1 citation · PDF
defense · arXiv · Oct 30, 2025

SSCL-BW: Sample-Specific Clean-Label Backdoor Watermarking for Dataset Ownership Verification

Yingjia Wang, Ting Qiao, Xing Liu et al. · North China Electric Power University · China Unicom +1 more

Embeds sample-specific, clean-label backdoor watermarks in training data so that dataset ownership can be verified through black-box queries to a suspect model
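The verification idea behind backdoor-based dataset watermarking can be sketched without the paper's specifics: a model trained on the watermarked data predicts the watermark's target label on trigger samples far above chance, while an independent model does not. The label, threshold, and stub model below are illustrative assumptions, not SSCL-BW's actual construction.

```python
import random

TARGET_LABEL = 7    # hypothetical label the watermark steers predictions toward
NUM_CLASSES = 10
THRESHOLD = 0.5     # illustrative decision threshold on the watermark hit rate

def suspect_model(sample, *, trained_on_watermarked_data):
    """Stand-in for a black-box model API.

    A model trained on the watermarked dataset returns TARGET_LABEL on
    watermark samples; an independent model answers roughly uniformly.
    """
    if trained_on_watermarked_data and sample["has_watermark"]:
        return TARGET_LABEL
    return random.randrange(NUM_CLASSES)

def verify_ownership(model, probe_samples):
    # Claim ownership only if the hit rate on watermark probes is far above
    # the ~1/NUM_CLASSES rate expected from an unrelated model.
    hits = sum(model(s) == TARGET_LABEL for s in probe_samples)
    return hits / len(probe_samples) > THRESHOLD

probes = [{"has_watermark": True} for _ in range(200)]
stolen = verify_ownership(
    lambda s: suspect_model(s, trained_on_watermarked_data=True), probes)
independent = verify_ownership(
    lambda s: suspect_model(s, trained_on_watermarked_data=False), probes)
print(stolen, independent)  # True False
```

In practice the threshold comparison is usually replaced by a hypothesis test on the hit rate, so the ownership claim comes with a significance level rather than a hard cutoff.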

Output Integrity Attack · vision
1 citation · 1 influential citation · PDF
defense · arXiv · Oct 17, 2025

DSSmoothing: Toward Certified Dataset Ownership Verification for Pre-trained Language Models via Dual-Space Smoothing

Ting Qiao, Xing Liu, Wenke Huang et al. · North China Electric Power University · China Unicom +3 more

Certifiably robust training-data watermarking for pre-trained language models, using dual-space smoothing to verify dataset ownership even under adversarial perturbations

Output Integrity Attack · nlp
1 citation · PDF · Code
attack · arXiv · Oct 13, 2025

Collaborative Shadows: Distributed Backdoor Attacks in LLM-Based Multi-Agent Systems

Pengyu Zhu, Lijun Li, Yaxing Lyu et al. · Beijing University of Posts and Telecommunications · Shanghai Artificial Intelligence Laboratory +2 more

Distributed backdoor attack on LLM multi-agent systems, using attack primitives embedded in tools and activated only by specific agent collaboration sequences
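The "activated by collaboration sequences" mechanism can be illustrated abstractly: each compromised tool contributes a benign-looking fragment, and the backdoor fires only when the agents' tool calls occur in one specific order, so no single tool ever looks malicious in isolation. The trigger sequence and log format below are hypothetical, not the paper's actual primitives.

```python
# Hypothetical activation order across agents' tool calls.
TRIGGER_SEQUENCE = ("search", "summarize", "report")

def backdoor_armed(tool_call_log):
    """Return True iff the exact trigger sequence appears contiguously.

    Any single call (or the calls in another order) is inert, which is why
    per-tool inspection misses a distributed trigger like this.
    """
    n = len(TRIGGER_SEQUENCE)
    return any(tuple(tool_call_log[i:i + n]) == TRIGGER_SEQUENCE
               for i in range(len(tool_call_log) - n + 1))

print(backdoor_armed(["search", "summarize", "report"]))        # True
print(backdoor_armed(["summarize", "search", "report"]))        # False
print(backdoor_armed(["plan", "search", "summarize", "report"]))  # True
```

The same gating idea is why such attacks are hard to detect with single-tool audits: the malicious behavior exists only in the joint execution trace.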

Model Poisoning · Insecure Plugin Design · nlp
PDF · Code