ML Security Papers

Latest papers

3 papers

attack arXiv Apr 18, 2026 · 4w ago

Yuheng Chen, Zhiyu Wu, Bowen Cheng et al. · Kagoshima University · Fudan University +1 more

Bypasses LLM safety alignment by reformulating harmful prompts as forced-choice questions where all options violate policies

Prompt Injection nlp

attack arXiv Mar 23, 2026 · 8w ago

Chengyin Hu, Yikun Guo, Yuxian Dong et al. · China University of Petroleum-Beijing · University of Electronic Science and Technology of China +3 more

Universal adversarial patch attack on infrared pedestrian detectors using parameterized Bézier curves and cold patches

Input Manipulation Attack vision

defense arXiv Mar 9, 2026 · 10w ago

Qishun Yang, Shu Yang, Lijie Hu et al. · King Abdullah University of Science and Technology · China University of Petroleum-Beijing +1 more

Defends VLMs against visual jailbreaks via label-free fine-tuning on neutral threat-image tasks to shape safety-oriented personas

Prompt Injection vision multimodal nlp