Furong Huang

h-index: 11 · 469 citations · 25 papers (total)

Papers in Database (1)

defense · arXiv · Feb 11, 2026

Safety Recovery in Reasoning Models Is Only a Few Early Steering Steps Away

Soumya Suvra Ghosal, Souradip Chakraborty, Vaibhav Singh et al. · College Park · IIT Bombay +1 more

Inference-time defense for multimodal reasoning VLMs that monitors reasoning traces and steers them back toward safe behavior within 1-3 early steps, cutting jailbreak attack success rate (ASR) by 30-60%

Input Manipulation Attack · Prompt Injection · multimodal · nlp