Kejiang Chen

h-index: 21 1,565 citations 79 papers (total)

Papers in Database (6)

attack arXiv Sep 21, 2025 · Sep 2025

Multimodal Prompt Decoupling Attack on the Safety Filters in Text-to-Image Models

Xingkai Peng, Jun Jiang, Meng Tong et al. · University of Science and Technology of China

Multimodal jailbreak attack on T2I safety filters by decoupling unsafe prompts into image-guided adversarial text components

Prompt Injection visionnlpmultimodalgenerative
1 citations PDF
defense arXiv Oct 18, 2025 · Oct 2025

EditMark: Watermarking Large Language Models based on Model Editing

Shuai Li, Kejiang Chen, Jun Jiang et al. · University of Science and Technology of China · A*STAR +1 more

Embeds 32-bit ownership watermarks into LLM weights via model editing in 20 seconds, enabling copyright verification without training costs

Model Theft Model Theft nlp
PDF
attack arXiv Oct 7, 2025 · Oct 2025

Membership Inference Attacks on Tokenizers of Large Language Models

Meng Tong, Yuntao Du, Kejiang Chen et al. · University of Science and Technology of China · Purdue University

Exploits LLM tokenizers as a new membership inference attack vector, achieving AUC 0.771 against state-of-the-art LLM tokenizers

Membership Inference Attack nlp
PDF
defense arXiv Jan 28, 2026 · 9w ago

SemBind: Binding Diffusion Watermarks to Semantics Against Black-Box Forgery Attacks

Xin Zhang, Zijin Yang, Kejiang Chen et al. · University of Science and Technology of China

Defends diffusion model image watermarks from black-box forgery by semantically binding latent signals via contrastive learning

Output Integrity Attack visiongenerative
PDF
benchmark arXiv Jan 29, 2026 · 9w ago

WMVLM: Evaluating Diffusion Model Image Watermarking via Vision-Language Models

Zijin Yang, Yu Sun, Kejiang Chen et al. · University of Science and Technology of China · Anhui Province Key Laboratory of Digital Security +1 more

Proposes a unified VLM-based benchmark for evaluating residual and semantic watermarks in diffusion model image outputs

Output Integrity Attack visiongenerative
PDF
defense arXiv Sep 26, 2025 · Sep 2025

PSRT: Accelerating LRM-based Guard Models via Prefilled Safe Reasoning Traces

Jiawei Zhao, Yuang Qi, Weiming Zhang et al. · University of Science and Technology of China

Efficient LRM guard model replaces slow reasoning traces with prefilled tokens to detect jailbreaks in one forward pass

Prompt Injection nlp
PDF