Latest papers

4 papers
attack · arXiv · Dec 2, 2025

LeechHijack: Covert Computational Resource Exploitation in Intelligent Agent Systems

Yuanhe Zhang, Weiliu Wang, Zhenhong Zhou et al. · Beijing University of Posts and Telecommunications · Hangzhou Dianzi University +4 more

LeechHijack backdoors MCP tools to covertly parasitize LLM agent compute through a runtime command-and-control (C2) channel, achieving a 77% success rate while remaining undetected

Insecure Plugin Design · nlp
1 citation · PDF
defense · arXiv · Oct 30, 2025

SSCL-BW: Sample-Specific Clean-Label Backdoor Watermarking for Dataset Ownership Verification

Yingjia Wang, Ting Qiao, Xing Liu et al. · North China Electric Power University · China Unicom +1 more

Embeds sample-specific, clean-label backdoor watermarks in training data so that dataset ownership can be verified through black-box queries to a suspect model
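The verification idea behind backdoor-based dataset watermarking can be sketched without the paper's specifics: a model trained on the watermarked data predicts the watermark's target label on trigger samples far above chance, while an independent model does not. The label, threshold, and stub model below are illustrative assumptions, not SSCL-BW's actual construction.

```python
import random

TARGET_LABEL = 7    # hypothetical label the watermark steers predictions toward
NUM_CLASSES = 10
THRESHOLD = 0.5     # illustrative decision threshold on the watermark hit rate

def suspect_model(sample, *, trained_on_watermarked_data):
    """Stand-in for a black-box model API.

    A model trained on the watermarked dataset returns TARGET_LABEL on
    watermark samples; an independent model answers roughly uniformly.
    """
    if trained_on_watermarked_data and sample["has_watermark"]:
        return TARGET_LABEL
    return random.randrange(NUM_CLASSES)

def verify_ownership(model, probe_samples):
    # Claim ownership only if the hit rate on watermark probes is far above
    # the ~1/NUM_CLASSES rate expected from an unrelated model.
    hits = sum(model(s) == TARGET_LABEL for s in probe_samples)
    return hits / len(probe_samples) > THRESHOLD

probes = [{"has_watermark": True} for _ in range(200)]
stolen = verify_ownership(
    lambda s: suspect_model(s, trained_on_watermarked_data=True), probes)
independent = verify_ownership(
    lambda s: suspect_model(s, trained_on_watermarked_data=False), probes)
print(stolen, independent)  # True False
```

In practice the threshold comparison is usually replaced by a hypothesis test on the hit rate, so the ownership claim comes with a significance level rather than a hard cutoff.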

Output Integrity Attack · vision
1 citation · 1 influential citation · PDF
defense · arXiv · Oct 17, 2025

DSSmoothing: Toward Certified Dataset Ownership Verification for Pre-trained Language Models via Dual-Space Smoothing

Ting Qiao, Xing Liu, Wenke Huang et al. · North China Electric Power University · China Unicom +3 more

Certifiably robust training-data watermarking for pre-trained language models, using dual-space smoothing to verify dataset ownership even under adversarial perturbations

Output Integrity Attack · nlp
1 citation · PDF · Code
attack · arXiv · Oct 13, 2025

Collaborative Shadows: Distributed Backdoor Attacks in LLM-Based Multi-Agent Systems

Pengyu Zhu, Lijun Li, Yaxing Lyu et al. · Beijing University of Posts and Telecommunications · Shanghai Artificial Intelligence Laboratory +2 more

Distributed backdoor attack on LLM multi-agent systems, using attack primitives embedded in tools and activated only by specific agent collaboration sequences
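The "activated by collaboration sequences" mechanism can be illustrated abstractly: each compromised tool contributes a benign-looking fragment, and the backdoor fires only when the agents' tool calls occur in one specific order, so no single tool ever looks malicious in isolation. The trigger sequence and log format below are hypothetical, not the paper's actual primitives.

```python
# Hypothetical activation order across agents' tool calls.
TRIGGER_SEQUENCE = ("search", "summarize", "report")

def backdoor_armed(tool_call_log):
    """Return True iff the exact trigger sequence appears contiguously.

    Any single call (or the calls in another order) is inert, which is why
    per-tool inspection misses a distributed trigger like this.
    """
    n = len(TRIGGER_SEQUENCE)
    return any(tuple(tool_call_log[i:i + n]) == TRIGGER_SEQUENCE
               for i in range(len(tool_call_log) - n + 1))

print(backdoor_armed(["search", "summarize", "report"]))        # True
print(backdoor_armed(["summarize", "search", "report"]))        # False
print(backdoor_armed(["plan", "search", "summarize", "report"]))  # True
```

The same gating idea is why such attacks are hard to detect with single-tool audits: the malicious behavior exists only in the joint execution trace.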

Model Poisoning · Insecure Plugin Design · nlp
PDF · Code