Xin Zhao

h-index: 3 · 13 citations · 10 papers (total)

Papers in Database (3)

defense arXiv Nov 12, 2025 · Nov 2025

Value-Aligned Prompt Moderation via Zero-Shot Agentic Rewriting for Safe Image Generation

Xin Zhao, Xiaojun Chen, Bingshan Liu et al. · Chinese Academy of Sciences · State Key Laboratory of Cyberspace Security Defense +1 more

Defends text-to-image models from jailbreak prompts via LLM-driven zero-shot prompt rewriting with cultural and intent-aware safety checks

Prompt Injection multimodal generative nlp
1 citation · PDF

defense arXiv Nov 12, 2025 · Nov 2025

DeepTracer: Tracing Stolen Model via Deep Coupled Watermarks

Yunfei Yang, Xiaojun Chen, Yuexin Xuan et al. · Chinese Academy of Sciences · State Key Laboratory of Cyberspace Security Defense +2 more

Embeds coupled watermarks in models that adversaries inevitably carry over when stealing via query-based extraction attacks

Model Theft vision
1 citation · PDF · Code

attack arXiv Oct 15, 2025 · Oct 2025

Who Speaks for the Trigger? Dynamic Expert Routing in Backdoored Mixture-of-Experts Transformers

Xin Zhao, Xiaojun Chen, Bingshan Liu et al. · Chinese Academy of Sciences · State Key Laboratory of Cyberspace Security Defense +2 more

Backdoor attack exploiting MoE routing preferences in LLMs to hijack expert pathways with up to 100% attack success rate

Model Poisoning nlp
PDF