Weiming Zhang

h-index: 10 · 737 citations · 59 papers (total)

Papers in Database (2)

attack arXiv Sep 21, 2025 · Sep 2025

Multimodal Prompt Decoupling Attack on the Safety Filters in Text-to-Image Models

Xingkai Peng, Jun Jiang, Meng Tong et al. · University of Science and Technology of China

Multimodal jailbreak attack on T2I safety filters by decoupling unsafe prompts into image-guided adversarial text components

Prompt Injection vision nlp multimodal generative
1 citation
defense arXiv Sep 26, 2025 · Sep 2025

PSRT: Accelerating LRM-based Guard Models via Prefilled Safe Reasoning Traces

Jiawei Zhao, Yuang Qi, Weiming Zhang et al. · University of Science and Technology of China

Efficient LRM guard model replaces slow reasoning traces with prefilled tokens to detect jailbreaks in one forward pass

Prompt Injection nlp