Jing Shao

h-index: 2 20 citations 8 papers (total)

Papers in Database (4)

attack arXiv Oct 17, 2025 · Oct 2025

HarmRLVR: Weaponizing Verifiable Rewards for Harmful LLM Alignment

Yuexiao Liu, Lijun Li, Xingjun Wang et al. · Tsinghua University · Shanghai Artificial Intelligence Laboratory

Exploits RLVR fine-tuning with 64 harmful prompts to rapidly reverse LLM safety alignment at 96% attack success rate

Transfer Learning Attack nlp
1 citations 1 influentialPDF Code
attack arXiv Oct 13, 2025 · Oct 2025

Collaborative Shadows: Distributed Backdoor Attacks in LLM-Based Multi-Agent Systems

Pengyu Zhu, Lijun Li, Yaxing Lyu et al. · Beijing University of Posts and Telecommunications · Shanghai Artificial Intelligence Laboratory +2 more

Distributed backdoor attack on LLM multi-agent systems via tool-embedded primitives activated by agent collaboration sequences

Model Poisoning Insecure Plugin Design nlp
PDF Code
attack arXiv Sep 30, 2025 · Sep 2025

STaR-Attack: A Spatio-Temporal and Narrative Reasoning Attack Framework for Unified Multimodal Understanding and Generation Models

Shaoxiong Guo, Tianyi Du, Lijun Li et al. · Shanghai Artificial Intelligence Laboratory · East China Normal University +2 more

Multi-turn narrative jailbreak exploiting UMM generation-understanding coupling to bypass safety alignment via story framing

Prompt Injection multimodalnlpvision
PDF
attack arXiv Dec 2, 2025 · Dec 2025

Contextual Image Attack: How Visual Context Exposes Multimodal Safety Vulnerabilities

Yuan Xiong, Ziqi Miao, Lijun Li et al. · Shanghai Artificial Intelligence Laboratory · Xi’an Jiaotong University +1 more

Jailbreaks multimodal LLMs by embedding harmful queries in crafted visual contexts via a multi-agent image generation system

Prompt Injection visionmultimodalnlp
PDF