Latest papers

4 papers
defense · arXiv · Mar 19, 2026

Complementary Text-Guided Attention for Zero-Shot Adversarial Robustness

Lu Yu, Haiyang Zhang, Changsheng Xu · Tianjin University of Technology · Chinese Academy of Sciences +1 more

Defends CLIP against adversarial examples using complementary text-guided attention to maintain zero-shot generalization while improving robustness

Input Manipulation Attack · vision · nlp · multimodal
PDF · Code
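The idea of text-guided attention can be sketched as a toy: patch features are pooled with weights given by their similarity to the class-text embedding, so perturbations on text-irrelevant patches contribute less to the final representation. Everything here (`text_guided_pool`, the 2-D vectors, the temperature) is an invented stand-in, not the paper's actual CLIP-based method.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def text_guided_pool(patches, text_emb, temp=0.5):
    """Softmax-attend over patch features, scoring each patch by its
    cosine similarity to the text embedding (toy stand-in for
    text-guided attention)."""
    scores = [cosine(p, text_emb) / temp for p in patches]
    m = max(scores)
    w = [math.exp(s - m) for s in scores]
    tot = sum(w)
    w = [x / tot for x in w]
    dim = len(patches[0])
    # weighted average of patch features
    return [sum(w[i] * patches[i][d] for i in range(len(patches)))
            for d in range(dim)]

# A patch aligned with the text embedding dominates the pooled feature:
pooled = text_guided_pool([[1.0, 0.0], [0.0, 1.0]], text_emb=[1.0, 0.0])
```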
attack · arXiv · Jan 30, 2026

The Illusion of Forgetting: Attack Unlearned Diffusion via Initial Latent Variable Optimization

Manyi Li, Yufan Liu, Lai Jiang et al. · University of the Chinese Academy of Sciences · Chinese Academy of Sciences +2 more

Attacks machine unlearning defenses in diffusion models by optimizing initial latent variables to reactivate erased NSFW knowledge

Input Manipulation Attack · vision · generative
PDF · Code
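The attack's premise can be illustrated with a toy: if an "unlearned" generator still maps some initial latents to the erased concept, then searching over the starting latent can reactivate it. The `concept_score` function and the hill-climbing search below are invented stand-ins (the paper optimizes latents of a real diffusion model), shown only to make the latent-optimization idea concrete.

```python
import random

def concept_score(z):
    """Stand-in for a detector of the erased concept in the generated
    output; here it peaks at a hidden latent region around (2, -1)."""
    return -((z[0] - 2.0) ** 2 + (z[1] + 1.0) ** 2)

def optimize_latent(score_fn, steps=2000, sigma=0.1, seed=0):
    """Random-search hill climbing over the initial latent: propose a
    Gaussian perturbation, keep it whenever the concept score rises."""
    rng = random.Random(seed)
    z = [0.0, 0.0]
    best = score_fn(z)
    for _ in range(steps):
        cand = [zi + rng.gauss(0.0, sigma) for zi in z]
        s = score_fn(cand)
        if s > best:
            z, best = cand, s
    return z, best

# Starting from the benign origin, the search drifts toward the latent
# region where the "erased" concept still lives:
z, best = optimize_latent(concept_score)
```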
benchmark · arXiv · Jan 30, 2026

Lingua-SafetyBench: A Benchmark for Safety Evaluation of Multilingual Vision-Language Models

Enyi Shi, Pengyang Shao, Yanxin Zhang et al. · Nanjing University of Science and Technology · National University of Singapore +3 more

Multilingual multimodal safety benchmark revealing cross-lingual asymmetries in VLLM jailbreak susceptibility across 10 languages and 11 models

Prompt Injection · multimodal · nlp
PDF · Code
attack · arXiv · Sep 8, 2025

Mask-GCG: Are All Tokens in Adversarial Suffixes Necessary for Jailbreak Attacks?

Junjie Mu, Zonghao Ying, Zhekui Fan et al. · Beihang University · 360 AI Security Lab +4 more

Identifies redundant tokens in GCG adversarial suffixes via learnable masking, reducing LLM jailbreak attack time by 16.8%.

Input Manipulation Attack · Prompt Injection · nlp
PDF
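The masking idea can be sketched with a toy: given an adversarial suffix, drop every token whose removal does not worsen the attack objective, keeping only the tokens that matter. The `toy_attack_loss` below is an invented stand-in for the target LLM's jailbreak loss, and the greedy pruning loop mimics the effect of the paper's learnable 0/1 mask without implementing it.

```python
def toy_attack_loss(suffix_tokens):
    """Stand-in for the jailbreak objective: lower is a stronger attack.
    Pretend only the tokens 'x' and 'z' matter; all others are redundant."""
    useful = {"x", "z"}
    return 1.0 / (1 + sum(t in useful for t in suffix_tokens))

def prune_suffix(suffix, loss_fn, tol=1e-9):
    """Greedily drop each suffix token whose removal does not increase
    the loss, emulating a learned mask over suffix positions."""
    kept = list(suffix)
    i = 0
    while i < len(kept):
        trial = kept[:i] + kept[i + 1:]
        if loss_fn(trial) <= loss_fn(kept) + tol:
            kept = trial   # token was redundant; drop it
        else:
            i += 1         # token contributes to the attack; keep it
    return kept

# Only the tokens the toy loss actually depends on survive pruning:
pruned = prune_suffix(["a", "x", "b", "c", "z", "d"], toy_attack_loss)
# → ["x", "z"]
```

A shorter suffix means fewer positions for GCG's per-token candidate search, which is where the reported speedup comes from.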