ML Security Papers

Latest papers

4 papers

benchmark arXiv Feb 10, 2026

Zhisheng Qi, Utkarsh Sahu, Li Ma et al. · University of Oregon · Michigan State University +6 more

First systematic benchmark comparing knowledge-extraction attacks and defenses on RAG systems under unified evaluation protocols

Sensitive Information Disclosure nlp

attack arXiv Jan 30, 2026

Jiate Li, Defu Cao, Li Li et al. · University of Southern California · Adobe Research +1 more

Black-box query-agnostic adversarial token injection attack manipulates document rankings in RAG and LLM-based retrieval systems using surrogate LLMs

Input Manipulation Attack Prompt Injection nlp

1 citation

defense ACM MM Oct 3, 2025

Naresh Kumar Devulapally, Shruti Agarwal, Tejas Gokhale et al. · The State University of New York · Adobe Research +1 more

Defends user images from unauthorized diffusion model personalization via imperceptible latent-space trajectory-shifted poisoning perturbations

Data Poisoning Attack Output Integrity Attack vision generative

defense arXiv Sep 11, 2025

Mohsen Fayyaz, Ali Modarressi, Hanieh Deilamsalehy et al. · University of California · Adobe Research +2 more

Manipulates MoE expert routing at inference time to steer LLM safety, reducing safety by 100% when combined with jailbreaks

Prompt Injection nlp