Jiawei Lian

h-index: 10 287 citations 24 papers (total)

Papers in Database (2)

attack arXiv Sep 18, 2025 · Sep 2025

Semantic Representation Attack against Aligned Large Language Models

Jiawei Lian, Jianhong Pan, Lefan Wang et al. · The Hong Kong Polytechnic University · Northwestern Polytechnical University

Jailbreaks safety-aligned LLMs by targeting semantic representation space rather than exact affirmative token patterns

Prompt Injection nlp
1 citations PDF Code
attack arXiv Jan 20, 2026 · 10w ago

Diffusion-Guided Backdoor Attacks in Real-World Reinforcement Learning

Tairan Huang, Qingqing Ye, Yulin Jin et al. · The Hong Kong Polytechnic University

Diffusion-generated floor patch triggers bypass real-world safety control stacks to reliably activate backdoors in RL robot policies

Model Poisoning reinforcement-learningvision
PDF