Shicheng Liu

h-index: 2 · 14 citations · 7 papers (total)

Papers in Database (1)

defense · arXiv · Jan 31, 2026

Steering to Say No: Configurable Refusal via Activation Steering in Vision Language Models

Jiaxi Yang, Shicheng Liu, Yuchen Yang et al. · The Pennsylvania State University

Proposes configurable refusal for vision language models (VLMs) based on activation steering, adaptively balancing under-refusal and over-refusal.

Prompt Injection · vision · nlp · multimodal