Liming Fang

Papers in Database (1)

attack · arXiv · Aug 14, 2025

Jailbreaking Commercial Black-Box LLMs with Explicitly Harmful Prompts

Chiyu Zhang, Lu Zhou, Xiaogang Xu et al. · Nanjing University of Aeronautics and Astronautics · Collaborative Innovation Center of Novel Software Technology and Industrialization +2 more

A novel black-box jailbreak attack that combines adversarial context alignment with fake chain-of-thought prompting to bypass the safety guardrails of reasoning LLMs.

Prompt Injection · NLP