Latest papers

4 papers
defense arXiv Jan 31, 2026 · Jan 2026

Towards Building Non-Fine-Tunable Foundation Models

Ziyao Wang, Nizhang Li, Pingzhi Li et al. · University of Maryland, College Park · Macau University of Science and Technology +1 more

Defends open-source LLMs against unauthorized fine-tuning by hiding a sparse subnetwork mask, so that fine-tuning without the key degrades adaptation

Transfer Learning Attack Model Theft nlp
PDF
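The "hidden sparse subnetwork mask" idea can be pictured as a key-derived binary mask over each weight tensor: key holders can locate the functional subnetwork, while an adversary fine-tuning the released weights updates the wrong parameters. The sketch below is a minimal illustration of that general idea, not the paper's actual construction; `keyed_sparse_mask` and the (key, shape)-seeded keying scheme are hypothetical.

```python
import torch

def keyed_sparse_mask(param: torch.Tensor, key: int, keep: float = 0.1) -> torch.Tensor:
    """Derive a deterministic sparse binary mask from a secret key.
    Hypothetical scheme: seed a PRNG with (key, tensor shape) and keep
    roughly a `keep` fraction of the weights."""
    seed = hash((key, tuple(param.shape))) % (2**31)
    gen = torch.Generator().manual_seed(seed)
    return (torch.rand(param.shape, generator=gen) < keep).float()

def select_subnetwork(model: torch.nn.Module, key: int) -> None:
    """Zero out all weights outside the keyed subnetwork. Without the key,
    the functional subnetwork cannot be located, so naive fine-tuning
    spreads updates across (and degrades) the hidden structure."""
    with torch.no_grad():
        for p in model.parameters():
            p.mul_(keyed_sparse_mask(p, key))
```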
benchmark arXiv Nov 26, 2025 · Nov 2025

Exploring Dynamic Properties of Backdoor Training Through Information Bottleneck

Xinyu Liu, Xu Zhang, Can Chen et al. · Michigan State University · Illinois Institute of Technology +1 more

Uses Information Bottleneck theory to analyze backdoor training dynamics and proposes a model-level stealthiness metric for backdoor attacks

Model Poisoning vision
PDF Code
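Information Bottleneck analyses of this kind typically track mutual-information quantities such as I(Z; Y) between hidden representations and labels over training, comparing trajectories for clean versus trigger-stamped samples. The toy binned estimator below is a crude stand-in for the more careful estimators such papers use, not code from the paper; it assumes scalar features `z` and integer labels `y`.

```python
import numpy as np

def binned_mutual_information(z: np.ndarray, y: np.ndarray, bins: int = 30) -> float:
    """Histogram estimate of I(Z; Y) for scalar features z and discrete labels y."""
    z_binned = np.digitize(z, np.histogram_bin_edges(z, bins=bins))
    joint = np.zeros((z_binned.max() + 1, y.max() + 1))
    for zi, yi in zip(z_binned, y):
        joint[zi, yi] += 1
    joint /= joint.sum()                                  # empirical joint p(z, y)
    pz = joint.sum(axis=1, keepdims=True)                 # marginal p(z)
    py = joint.sum(axis=0, keepdims=True)                 # marginal p(y)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (pz @ py)[nz])).sum())
```

Tracking this quantity per epoch on clean and poisoned subsets is one simple way to expose the distinct training dynamics the summary refers to.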
attack arXiv Sep 15, 2025 · Sep 2025

Phi: Preference Hijacking in Multi-modal Large Language Models at Inference Time

Yifan Lan, Yuanpu Cao, Weitong Zhang et al. · The Pennsylvania State University · The University of North Carolina at Chapel Hill

Gradient-optimized adversarial images hijack MLLM output preferences at inference time with transferable universal perturbations

Input Manipulation Attack Prompt Injection vision nlp multimodal
PDF Code
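The attack summary maps onto a standard PGD-style loop: optimize a norm-bounded image perturbation so the model's output distribution favors a target response. The sketch below assumes a hypothetical `model.loss(image, target_ids)` wrapper that returns a differentiable cross-entropy against the target tokens; the paper's actual objective and its universal-perturbation training are not reproduced here.

```python
import torch

def hijack_image(model, image, target_ids, steps=200, eps=8 / 255, lr=1 / 255):
    """PGD-style sketch: find a bounded perturbation `delta` that steers the
    model toward `target_ids`. `model.loss` is a hypothetical interface."""
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        loss = model.loss(image + delta, target_ids)  # hypothetical wrapper
        loss.backward()
        with torch.no_grad():
            delta -= lr * delta.grad.sign()           # signed gradient step
            delta.clamp_(-eps, eps)                   # stay within the L-inf ball
            delta.grad.zero_()
    return (image + delta).clamp(0, 1).detach()
```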
defense arXiv Jan 5, 2025 · Jan 2025

Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense

Yang Ouyang, Hengrui Gu, Shuhang Lin et al. · North Carolina State University · Rutgers University +4 more

Defends LLMs against jailbreaks by identifying harmful-token-generating layers and patching them via adversarial unlearning

Prompt Injection nlp
PDF Code
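"Patching via adversarial unlearning" plausibly reduces to gradient ascent on the affirmative first token ("Sure", etc.), restricted to the implicated layers. The sketch below is a simplified stand-in, not the paper's procedure; `model.layers`, the harmful-prompt iterator, and the single-token loss are all assumptions.

```python
import torch
import torch.nn.functional as F

def unlearn_affirmative(model, layer_idxs, harmful_prompts, affirm_id,
                        lr=1e-5, steps=50):
    """Gradient-ascent 'unlearning' restricted to suspect layers: make the
    affirmative first token less likely on harmful prompts. `model.layers`
    is an assumed attribute exposing per-layer parameters."""
    params = [p for i in layer_idxs for p in model.layers[i].parameters()]
    opt = torch.optim.AdamW(params, lr=lr)
    for _ in range(steps):
        for input_ids in harmful_prompts:              # (batch, seq) token ids
            logits = model(input_ids).logits[:, -1, :] # next-token logits
            target = torch.full((logits.size(0),), affirm_id, dtype=torch.long)
            loss = -F.cross_entropy(logits, target)    # ascend, not descend
            opt.zero_grad()
            loss.backward()
            opt.step()
```

Restricting the optimizer to the exposed layers is what keeps the patch local, leaving the rest of the model's behavior intact.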