ML Security Papers

Latest papers

1 paper

attack · arXiv · Sep 29, 2025

VISOR++: Universal Visual Inputs based Steering for Large Vision Language Models

Ravikumar Balakrishnan, Mansi Phute · HiddenLayer Inc. · Georgia Institute of Technology

Optimizes adversarial images that steer VLM alignment behaviors, such as refusal and sycophancy, without requiring access to model internals at runtime.

Input Manipulation Attack · Prompt Injection · vision · nlp · multimodal

1 citation