Sanket Mendapara

Papers in Database (1)

attack · arXiv · Apr 28, 2026 · 23d ago

One Perturbation, Two Failure Modes: Probing VLM Safety via Embedding-Guided Typographic Perturbations

Ravikumar Balakrishnan, Sanket Mendapara · Cisco Systems

Adversarial visual perturbations that bypass VLM safety filters via embedding-guided typographic optimization, exploiting both readability and alignment weaknesses

Input Manipulation Attack · Prompt Injection · vision · nlp · multimodal