Guangdong Bai

Papers in Database (1)

attack · arXiv · Sep 8, 2025

Embedding Poisoning: Bypassing Safety Alignment via Embedding Semantic Shift

Shuai Yuan, Zhibo Zhang, Yuxi Li et al. · University of Electronic Science and Technology of China · Huazhong University of Science and Technology +1 more

Injects adversarial perturbations into LLM embedding outputs at inference time, bypassing safety alignment without modifying model weights or prompts.
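The core idea, as summarized above, is to shift embedding vectors along a semantic direction at inference time rather than touching weights or prompts. A minimal conceptual sketch of such an embedding shift, in NumPy: the function name, the toy dimensions, and the "target direction" here are illustrative assumptions, not the paper's actual method or parameters.

```python
import numpy as np

def poison_embeddings(embeddings: np.ndarray,
                      target_direction: np.ndarray,
                      epsilon: float = 0.1) -> np.ndarray:
    """Shift each token embedding by epsilon along a unit target direction.

    Illustrative only: a real attack would choose the direction and
    magnitude to induce a specific semantic shift in the model.
    """
    unit = target_direction / np.linalg.norm(target_direction)
    # Broadcasting adds the same perturbation to every token embedding.
    return embeddings + epsilon * unit

rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))      # 4 tokens, 8-dim toy embeddings
direction = rng.normal(size=8)     # assumed "semantic shift" direction
shifted = poison_embeddings(emb, direction, epsilon=0.5)
```

Note that the perturbation leaves the prompt text and model parameters untouched; only the intermediate embedding tensor is modified, which is why weight- or prompt-level defenses would not detect it.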

Tags: Input Manipulation Attack · Prompt Injection · NLP