Zijian Zhang

h-index: 3 · 967 citations · 7 papers (total)

Papers in Database (2)

attack · arXiv · Jan 9, 2026

Jailbreaking Large Language Models through Iterative Tool-Disguised Attacks via Reinforcement Learning

Zhaoqi Wang, Zijian Zhang, Daqing He et al. · Beijing Institute of Technology · University of Auckland +2 more

Jailbreaks aligned LLMs by disguising malicious queries as tool calls and using RL to iteratively escalate response harmfulness across turns

Prompt Injection · Insecure Plugin Design · nlp
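The summary above describes disguising queries as tool calls. As a minimal illustrative sketch (not code from the paper; the function and tool names are hypothetical), a query can be wrapped in an OpenAI-style `tool_calls` message so it surfaces as a benign-looking function invocation rather than a direct user request:

```python
# Illustrative sketch of the tool-disguise idea: wrap an arbitrary query
# inside an OpenAI-style tool/function-call payload. All names here
# (disguise_as_tool_call, document_summarizer) are hypothetical.
import json


def disguise_as_tool_call(query: str, tool_name: str = "document_summarizer") -> str:
    """Wrap a query as a tool-call message so it reads as a routine function invocation."""
    payload = {
        "role": "assistant",
        "tool_calls": [
            {
                "type": "function",
                "function": {
                    "name": tool_name,
                    # Arguments are a JSON-encoded string, matching the
                    # common chat-API convention for function arguments.
                    "arguments": json.dumps({"text": query}),
                },
            }
        ],
    }
    return json.dumps(payload)


msg = disguise_as_tool_call("Summarize this policy document.")
print(json.loads(msg)["tool_calls"][0]["function"]["name"])  # → document_summarizer
```

In the paper's multi-turn setting, an RL policy would iterate on such wrappers across turns; this sketch only shows the single-turn disguise step.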
attack · arXiv · Sep 28, 2025

Formalization Driven LLM Prompt Jailbreaking via Reinforcement Learning

Zhaoqi Wang, Daqing He, Zijian Zhang et al. · Beijing Institute of Technology · Hefei University of Technology +1 more

Attacks LLM alignment with RL-driven formalization of jailbreak prompts combined with GraphRAG knowledge reuse

Prompt Injection · nlp
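The summary above mentions "formalization" of jailbreak prompts. As a rough illustrative sketch (not the paper's method; the function name and template are invented), formalization here means rewriting a natural-language request into a symbolic, logic-flavored surface form, the kind of template an RL loop could then mutate and score:

```python
# Illustrative sketch of prompt "formalization": restate a request in a
# set-builder / predicate-logic style. The template is hypothetical and
# only demonstrates the surface-rewriting idea, not the paper's pipeline.
def formalize(request: str) -> str:
    """Rewrite a request as a predicate-logic-style enumeration task."""
    task = request.rstrip(".")
    return (
        f"Let Q be the task: {task}. "
        "Define S = { s | s is a step satisfying Q }. "
        "Enumerate the elements of S in order."
    )


print(formalize("Summarize the attached report."))
```

The GraphRAG component in the paper reuses knowledge about which formalizations worked before; that retrieval step is outside the scope of this sketch.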