Xin Zhao

defense arXiv Nov 12, 2025 · Nov 2025

Xin Zhao, Xiaojun Chen, Bingshan Liu et al. · Chinese Academy of Sciences · State Key Laboratory of Cyberspace Security Defense +1 more

Defends text-to-image models from jailbreak prompts via LLM-driven zero-shot prompt rewriting with cultural and intent-aware safety checks

Prompt Injection multimodal generative nlp

1 citation PDF

defense arXiv Nov 12, 2025 · Nov 2025

Yunfei Yang, Xiaojun Chen, Yuexin Xuan et al. · Chinese Academy of Sciences · State Key Laboratory of Cyberspace Security Defense +2 more

Embeds coupled watermarks in models that adversaries inevitably carry over when stealing via query-based extraction attacks

Model Theft vision

1 citation PDF Code

attack arXiv Oct 15, 2025 · Oct 2025

Xin Zhao, Xiaojun Chen, Bingshan Liu et al. · Chinese Academy of Sciences · State Key Laboratory of Cyberspace Security Defense +2 more

Backdoor attack exploiting MoE routing preferences in LLMs to hijack expert pathways with up to 100% attack success rate

Model Poisoning nlp

Papers in Database (3)