Yaojie Lu

Papers in Database (1)

attack · arXiv · Jan 3, 2025

Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models

Yanjiang Liu, Shuhen Zhou, Yaojie Lu et al. · Institute of Software · University of Chinese Academy of Sciences +1 more

RL-based automated red-teaming framework that optimizes jailbreak strategies against LLMs, achieving 16.63% higher attack success rates than existing methods

Prompt Injection nlp