Latest papers

1 paper
defense · arXiv · Mar 28, 2026

Diagnosing and Repairing Unsafe Channels in Vision-Language Models via Causal Discovery and Dual-Modal Safety Subspace Projection

Jinhu Fu, Yihang Lou, Qingyi Si et al. · Beijing University of Posts and Telecommunications · Chongqing University of Posts and Telecommunications +2 more

Identifies and repairs unsafe neural pathways in VLMs using causal mediation analysis and dual-modal safety subspace projection

Input Manipulation Attack · Prompt Injection · multimodal · vision · nlp