Lefan Wang

attack · arXiv · Sep 18, 2025

Jiawei Lian, Jianhong Pan, Lefan Wang et al. · The Hong Kong Polytechnic University · Northwestern Polytechnical University

Jailbreaks safety-aligned LLMs by targeting semantic representation space rather than exact affirmative token patterns

Prompt Injection nlp

1 citation · PDF · Code

Papers in Database (1)