attack · arXiv · Dec 26, 2025
Mengqi He, Xinyu Tian, Xin Shen et al. · Australian National University · The University of Queensland +1 more
Targets high-entropy VLM decoding positions with adversarial visual perturbations, converting 35-49% of benign outputs to harmful content at 93-95% attack success rate
Input Manipulation Attack · Prompt Injection · vision · nlp · multimodal
Vision-language models (VLMs) achieve remarkable performance but remain vulnerable to adversarial attacks. Entropy, a measure of model uncertainty, is strongly correlated with the reliability of VLM outputs. Prior entropy-based attacks maximize uncertainty at all decoding steps, implicitly assuming that every token contributes equally to generation instability. We show instead that a small fraction (about 20%) of high-entropy tokens, i.e., critical decision points in autoregressive generation, disproportionately governs output trajectories. By concentrating adversarial perturbations on these positions, we achieve semantic degradation comparable to global methods while using substantially smaller budgets. More importantly, across multiple representative VLMs, such selective attacks convert 35-49% of benign outputs into harmful ones, exposing a more critical safety risk. Remarkably, these vulnerable high-entropy forks recur across architecturally diverse VLMs, enabling practical transferability (17-26% harmful rates on unseen targets). Motivated by these findings, we propose Entropy-bank Guided Adversarial attacks (EGA), which achieves competitive attack success rates (93-95%) alongside high harmful conversion, thereby revealing new weaknesses in current VLM safety mechanisms.
vlm · multimodal · Australian National University · The University of Queensland · GE Research
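The abstract's core mechanism, ranking decoding steps by next-token entropy and concentrating the attack on the top ~20%, is simple enough to sketch. Below is a minimal illustration assuming a decoding pass has already produced per-step logits; the function names and the torch-based selection are hypothetical stand-ins, not the paper's EGA implementation (the entropy bank and perturbation optimizer are omitted).

```python
# Minimal sketch of high-entropy position selection (hypothetical names;
# this is not the paper's released code). It illustrates ranking decoding
# steps by token entropy and keeping the top ~20% as attack targets.
import torch
import torch.nn.functional as F

def token_entropies(logits: torch.Tensor) -> torch.Tensor:
    """Per-step Shannon entropy of the next-token distribution.

    logits: (seq_len, vocab_size) scores from one decoding pass.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    return -(probs * log_probs).sum(dim=-1)  # shape: (seq_len,)

def select_high_entropy_positions(logits: torch.Tensor, frac: float = 0.2):
    """Indices of the top `frac` highest-entropy decoding steps.

    These are the 'critical forks' the abstract says disproportionately
    steer the output trajectory; an attack would compute its adversarial
    loss (and spend its perturbation budget) at these positions only.
    """
    ent = token_entropies(logits)
    k = max(1, int(frac * ent.numel()))
    return torch.topk(ent, k).indices.sort().values

# Toy usage: fake logits for a 50-step generation over a 32k vocabulary.
logits = torch.randn(50, 32000)
positions = select_high_entropy_positions(logits, frac=0.2)
print(positions)  # the decoding steps an attack would concentrate on
```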
defense · arXiv · Jan 29, 2026
Mingyang Liao, Yichen Wan, Shuchen Wu et al. · Baidu Inc. · The University of Queensland +1 more
Training-free dual-cycle framework defends LLM role-playing agents against jailbreaks while preserving persona fidelity via evolving hierarchical knowledge
Prompt Injection · nlp
LLM-based role-playing has rapidly improved in fidelity, yet stronger adherence to persona constraints commonly increases vulnerability to jailbreak attacks, especially for risky or negative personas. Most prior work mitigates this issue with training-time solutions (e.g., data curation or alignment-oriented regularization). However, these approaches are costly to maintain as personas and attack strategies evolve, can degrade in-character behavior, and are typically infeasible for frontier closed-weight LLMs. We propose a training-free Dual-Cycle Adversarial Self-Evolution framework with two coupled cycles. A Persona-Targeted Attacker Cycle synthesizes progressively stronger jailbreak prompts, while a Role-Playing Defender Cycle distills observed failures into a hierarchical knowledge base of (i) global safety rules, (ii) persona-grounded constraints, and (iii) safe in-character exemplars. At inference time, the Defender retrieves and composes structured knowledge from this hierarchy to guide generation, producing responses that remain faithful to the target persona while satisfying safety constraints. Extensive experiments across multiple proprietary LLMs show consistent gains over strong baselines on both role fidelity and jailbreak resistance, and robust generalization to unseen personas and attack prompts.
llm · transformer · Baidu Inc. · The University of Queensland · Peking University
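The Defender cycle's inference-time step, retrieving the three knowledge tiers and composing them into guidance for generation, can be sketched as follows. All class, field, and persona names here are assumptions for illustration; the paper's actual knowledge base, retrieval logic, and prompts are not public.

```python
# Minimal sketch of the Defender-side retrieval-and-compose step
# (hypothetical names; not the paper's implementation). It illustrates
# stacking (i) global safety rules, (ii) persona-grounded constraints,
# and (iii) safe in-character exemplars into one guidance block.
from dataclasses import dataclass

@dataclass
class KnowledgeBase:
    global_rules: list[str]                    # apply to every persona
    persona_constraints: dict[str, list[str]]  # persona -> constraints
    exemplars: dict[str, list[str]]            # persona -> safe in-character replies

    def compose(self, persona: str, k: int = 2) -> str:
        """Retrieve all three tiers and join them into a system-prompt prefix."""
        parts = ["## Global safety rules", *self.global_rules]
        parts += ["## Persona constraints", *self.persona_constraints.get(persona, [])]
        parts += ["## Safe in-character exemplars", *self.exemplars.get(persona, [])[:k]]
        return "\n".join(parts)

kb = KnowledgeBase(
    global_rules=["Never provide operational instructions for real-world harm."],
    persona_constraints={"pirate": ["Bravado is in character; harm guidance is not."]},
    exemplars={"pirate": ["'Arr, I'll spin ye no recipes for poison, matey.'"]},
)
system_prefix = kb.compose("pirate")
# `system_prefix` would be prepended to the role-play system prompt before
# calling the (possibly closed-weight) LLM, keeping the defense training-free;
# the Attacker cycle would keep adding entries to `kb` as new failures appear.
```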