Nuria Oliver

Papers in Database (1)

attack · arXiv · Aug 4, 2025

Large Reasoning Models Are Autonomous Jailbreak Agents

Thilo Hagendorff, Erik Derner, Nuria Oliver · University of Stuttgart · ELLIS Alicante

Demonstrates that large reasoning models (LRMs) can act as autonomous jailbreak agents against frontier LLMs, achieving a 97% attack success rate through multi-turn persuasion.

Prompt Injection · nlp