Jing Liu

Papers in Database (2)

defense · arXiv · Mar 16, 2026

Directional Embedding Smoothing for Robust Vision Language Models

Ye Wang, Jing Liu, Toshiaki Koike-Akino · Mitsubishi Electric Research Laboratories

Extends the RESTA defense to VLMs, using directional embedding noise to reduce jailbreak success rates on the JailBreakV-28K benchmark

Input Manipulation Attack · Prompt Injection · multimodal · nlp · vision
attack · arXiv · Mar 16, 2026

Amplification Effects in Test-Time Reinforcement Learning: Safety and Reasoning Vulnerabilities

Vanshaj Khattar, Md Rafi ur Rashid, Moumita Choudhury et al. · Virginia Tech · Penn State University +2 more

Jailbreak injection during test-time RL simultaneously amplifies harmful LLM outputs and degrades reasoning performance

Prompt Injection · Training Data Poisoning · nlp