Daqing He

h-index: 0 · 0 citations · 2 papers (total)

Papers in Database (2)

attack · arXiv · Sep 28, 2025

Formalization Driven LLM Prompt Jailbreaking via Reinforcement Learning

Zhaoqi Wang, Daqing He, Zijian Zhang et al. · Beijing Institute of Technology · Hefei University of Technology +1 more

Attacks LLM alignment with RL-driven formalization of jailbreak prompts combined with GraphRAG knowledge reuse

Tags: Prompt Injection · nlp
attack · arXiv · Jan 9, 2026

Jailbreaking Large Language Models through Iterative Tool-Disguised Attacks via Reinforcement Learning

Zhaoqi Wang, Zijian Zhang, Daqing He et al. · Beijing Institute of Technology · University of Auckland +2 more

Jailbreaks aligned LLMs by disguising malicious queries as tool calls and using RL to iteratively escalate response harmfulness across conversation turns

Tags: Prompt Injection · Insecure Plugin Design · nlp