Weiming Zhang

attack arXiv Sep 21, 2025 · Sep 2025

Xingkai Peng, Jun Jiang, Meng Tong et al. · University of Science and Technology of China

Multimodal jailbreak attack on T2I safety filters by decoupling unsafe prompts into image-guided adversarial text components

Prompt Injection visionnlpmultimodalgenerative

1 citations PDF

defense arXiv Sep 26, 2025 · Sep 2025

Jiawei Zhao, Yuang Qi, Weiming Zhang et al. · University of Science and Technology of China

Efficient LRM guard model replaces slow reasoning traces with prefilled tokens to detect jailbreaks in one forward pass

Prompt Injection nlp

Papers in Database (2)