Zhiyuan Yu

h-index: 13 · 586 citations · 21 papers (total)

Papers in Database (1)

defense · arXiv · Dec 12, 2025 · Dec 2025

Rethinking Jailbreak Detection of Large Vision Language Models with Representational Contrastive Scoring

Peichun Hua, Hao Li, Shanghao Shi et al. · Washington University in St. Louis · Texas A&M University

Detects LVLM jailbreaks by contrastively scoring internal model representations, separating malicious from novel-benign inputs

Input Manipulation Attack · Prompt Injection · multimodal · vision · nlp